CD109 nucleic acid molecules, polypeptides and methods of use

ABSTRACT

The invention is a CD109 nucleic acid molecule and its corresponding polypeptide. The invention also includes biologically functional equivalent nucleic acid molecules and polypeptides. The invention also relates to methods of using these nucleic acid sequences and polypeptides in medical diagnosis and treatment and in drug screening.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a National Stage application based on International Application No. PCT/CA02/00292, filed Mar. 7, 2002, which claims priority to U.S. Provisional Application No. 60/273,814, filed Mar. 7, 2001, the disclosures of which are incorporated herein by reference.

FIELD OF THE INVENTION

The invention relates to isolated nucleic acid molecules encoding CD109 polypeptides, and the CD109 polypeptides themselves. The invention also includes methods of using the nucleic acid molecules and polypeptides for treatment of diseases, disorders and abnormal physical states.

BACKGROUND OF THE INVENTION

CD109 is a cell surface antigen that marks primitive progenitor and hematopoietic stem cells and activated platelets and T lymphocytes. To date, the function of CD109 in these cell types has remained largely unknown. T cell CD109 has previously been implicated in the regulation of antibody inducing T helper cell function, but its role is poorly understood.

To date, no one has been able to isolate and sequence CD109 DNA or protein. There are many reasons for these problems including the very low levels of CD109 gene expression and the corresponding low level of protein expression and activity. Moreover, the instability of CD109 in protein extracts and its association with plasma membranes have contributed to the difficulties surrounding its isolation. Without the DNA and protein sequences, it is impossible to design rational strategies for the modulation of CD109 levels or activity by modulating gene and protein expression. There is a need to identify these sequences in order to establish methods to modulate CD109 activity. Protein sequence information is important for the elucidation of protein structure and the ultimate design of chemical inhibitors that may modulate CD109 activity in vivo. There is also a need for cells which overexpress CD109 polypeptides or in which gene expression is reduced or blocked.

SUMMARY OF THE INVENTION

The invention relates to isolated and characterized human CD109 nucleic acid molecules and the corresponding polypeptides. CD109 is a novel member of the α2 macroglobulin (α2M)/C3, C4, C5 family of thioester-containing proteins. Analysis of sequences shows that specific proteolytic cleavage of CD109 results in activation of its thioester. The chemical reactivity of the activated CD109 thioester likely is similar to that of complement rather than that of α2M proteins, with reactivity being directed preferentially towards hydroxyl-containing carbohydrate and protein nucleophiles. Activated CD109 is capable of covalent binding to a variety of substrates, including cell membranes. In addition, the t_(1/2) of the activated CD109 thioester is extremely short, so that CD109 action is spatially restricted to the site of its activation.

The invention relates to isolated nucleic acid molecules encoding CD109 polypeptides. The molecule preferably encodes a polypeptide including a thioester region which becomes reactive towards a nucleophile when the polypeptide is cleaved. Another aspect of the invention relates to an isolated nucleic acid molecule encoding a CD109 polypeptide, a fragment of a CD109 polypeptide having CD109 activity, or a polypeptide having CD109 activity, comprising a nucleic acid molecule selected from the group consisting of:

-   (a) a nucleic acid molecule that hybridizes to a nucleic acid molecule consisting of at least one of [SEQ ID NO:1, 3, 5, 7, 9 or 11], or a complement thereof under low, moderate or high stringency hybridization conditions wherein the nucleic acid molecule encodes a CD109 polypeptide or a polypeptide having CD109 activity;
-   (b) a nucleic acid molecule degenerate with respect to (a), wherein the nucleic acid molecule encodes a CD109 polypeptide or a polypeptide having CD109 activity.

The hybridization conditions optionally comprise low stringency conditions of 1×SSC, 0.1% SDS at 50° C. or high stringency conditions of 0.1×SSC, 0.1% SDS at 65° C.

Another aspect of the invention relates to an isolated nucleic acid molecule encoding a CD109 polypeptide, a fragment of a CD109 polypeptide having CD109 activity, or a polypeptide having CD109 activity, comprising a nucleic acid molecule selected from the group consisting of:

-   (a) the nucleic acid molecule of the coding strand shown in [SEQ ID NO:1, 3, 5, 7, 9 or 11], or a complement thereof;
-   (b) a nucleic acid molecule encoding the same amino acid sequence as a nucleotide sequence of (a); and
-   (c) a nucleic acid molecule having at least 17% identity with the nucleotide sequence of (a) and which encodes a CD109 polypeptide or a polypeptide having CD109 activity.

The CD109 polypeptide optionally comprises a K1 [SEQ ID NO:2], K1-H7 [SEQ ID NO:6] or K15 [SEQ ID NO:10] polypeptide or their variants [SEQ ID NO:4, 8 or 12]. The nucleic acid molecule of the invention optionally comprises all or part of a nucleotide sequence shown in [SEQ ID NO:1, 3, 5, 7, 9 or 11] or a complement thereof. The nucleic acid molecule of the invention optionally consists of the nucleotide sequence shown in [SEQ ID NO:1, 3, 5, 7, 9 or 11] or a complement thereof. The CD109 nucleic acid molecule or a fragment thereof is optionally isolated from a human. The nucleic acid molecule of the invention optionally comprises genomic DNA, cDNA or RNA. The nucleic acid molecule of the invention is optionally chemically synthesized.

In another aspect, the invention relates to an isolated nucleic acid molecule comprising a nucleic acid molecule selected from the group consisting of 8 to 10 nucleotides of the nucleic acid molecule of the invention or a region shown in Table 1.1, 1.2 or 1.3. The nucleic acid molecule of the invention optionally comprises at least 30 consecutive nucleotides of [SEQ ID NO:1, 3, 5, 7, 9 or 11] or a complement thereof.

Another embodiment of the invention relates to a recombinant nucleic acid molecule comprising a nucleic acid molecule of the invention and a constitutive promoter sequence or an inducible promoter sequence operatively linked so that the promoter enhances transcription of the nucleic acid molecule in a host cell.

Another embodiment of the invention relates to a vector comprising a nucleic acid molecule of the invention. The vector optionally comprises a promoter selected from the group consisting of a vav promoter, a H2K promoter, a PF4 promoter, a GP1b promoter, a lck promoter, a CD2 promoter, a granzyme B promoter, a beta actin promoter, a PGK promoter, a CMV promoter, a retroviral LTR, a metallothionein IIA promoter, an ecdysone promoter and a tetracycline inducible promoter.

Another embodiment of the invention relates to a host cell comprising the recombinant nucleic acid molecule of the invention, or progeny of the host cell. The host cell is optionally selected from the group consisting of a mammalian cell, a fungal cell, a yeast cell, a bacterial cell, a microorganism cell and a plant cell.

Another embodiment of the invention relates to an isolated polypeptide encoded by and/or produced from a nucleic acid molecule of the invention. The invention includes an isolated CD109 polypeptide or a fragment thereof having CD109 activity. The polypeptide of the invention optionally comprises all or part of an amino acid sequence in [SEQ ID NO:2, 4, 6, 8, 10 or 12]. The polypeptide optionally comprises ten or more, or ten or fewer consecutive residues of [SEQ ID NO:2, 4, 6, 8, 10 or 12].

The invention also includes an isolated immunogenic polypeptide, the amino acid sequence of which comprises ten or more, or ten or fewer consecutive residues of [SEQ ID NO:2, 4, 6, 8, 10 or 12]. The isolated polypeptide optionally comprises a region shown in Table 1a or Table 1b.

The invention includes a polypeptide fragment of a polypeptide of the invention or a peptide mimetic of said polypeptide. The polypeptide fragment optionally consists of or comprises 20 or more, or 20 or fewer, amino acids, which fragment has CD109 activity. The fragment or peptide mimetic is optionally capable of being bound by an antibody to the polypeptide of the invention. The polypeptide is optionally recombinantly produced.

The invention includes an isolated and purified polypeptide comprising the amino acid sequence of a CD109 polypeptide, wherein the polypeptide is encoded by a nucleic acid molecule that hybridizes under moderate or stringent conditions to a nucleic acid molecule in [SEQ ID NO:1, 3, 5, 7, 9 or 11], a degenerate form thereof or a complement.

The invention also includes a polypeptide comprising a sequence having greater than 20% sequence identity to a polypeptide of the invention. The polypeptide of the invention optionally comprises a CD109 polypeptide, such as a polypeptide isolated from a human cell. The polypeptide optionally comprises a region including at least 30% homology to a region shown in Table 1a or Table 1b. The invention includes an isolated nucleic acid molecule encoding a polypeptide of the invention.

The invention includes a CD109 specific antibody targeted to a region selected from the CD109 bait region, the CD109 thioester, or the CD109 thioester reactivity defining hexapeptide. The antibody optionally comprises a monoclonal antibody or a polyclonal antibody.

Another aspect of the invention relates to a pharmaceutical composition, comprising all or part of a polypeptide of the invention or a mimetic thereof, and a pharmaceutically acceptable carrier.

Another aspect of the invention relates to a pharmaceutical composition for use in gene therapy, comprising all or part of a nucleotide sequence of the invention, and a pharmaceutically acceptable carrier, auxiliary or excipient. Another variant of the invention relates to a pharmaceutical composition for use in gene therapy, comprising all or part of an antisense sequence to all or part of the nucleic acid sequence in [SEQ ID NO:1, 3, 5, 7, 9 or 11].

Another aspect of the invention relates to a kit for the treatment or detection of a disease, disorder or abnormal physical state, comprising all or part of a nucleotide sequence of the invention. A kit for the treatment or detection of a disease, disorder or abnormal physical state optionally comprises all or part of the polypeptide of the invention. A kit for the treatment or detection of a disease, disorder or abnormal physical state optionally comprises an antibody to a polypeptide of the invention. The kit is useful in relation to a disorder such as one selected from the group consisting of conditions associated with endothelial activation, platelet activation, activation of the coagulation or fibrinolytic systems, activation of T lymphocytes and of the complement system, cardiovascular disorders, stroke, myocardial infarction, thrombosis, embolism, peripheral vascular disease, disorders associated with quantitative or qualitative abnormalities of platelet function, thrombocytopenia, thrombocythemia, conditions associated with increased or impaired platelet aggregation and activation, conditions associated with increased or impaired activation of the coagulation and/or fibrinolytic systems, conditions associated with impaired or increased immune activation, autoimmune diseases, organ transplantation and bone marrow transplantation.

Another aspect of the invention relates to a method of medical treatment of a disease, disorder or abnormal physical state, characterized by excessive CD109 expression, concentration or activity, comprising administering a product that reduces or inhibits CD109 polypeptide expression, concentration or activity. The product is optionally an antisense nucleotide sequence to all or part of the nucleotide sequence of [SEQ ID NO:1, 3, 5, 7, 9 or 11], the antisense nucleotide sequence being sufficient to reduce or inhibit CD109 polypeptide expression. The antisense DNA is optionally administered in a pharmaceutical composition comprising a carrier and a vector operably linked to the antisense DNA. The invention also relates to a method of medical treatment of a disease, disorder or abnormal physical state, characterized by inadequate CD109 expression, concentration or activity, comprising administering a product that increases CD109 polypeptide expression, concentration or activity. The product is optionally a nucleotide sequence comprising all or part of the nucleotide sequence of [SEQ ID NO:1, 3, 5, 7, 9 or 11], the DNA being sufficient to increase CD109 polypeptide expression. The nucleotide sequence is optionally administered in a pharmaceutical composition comprising a carrier and a vector operably linked to the nucleotide sequence. The invention also includes a method of medical treatment of a disease, disorder or abnormal physical state having normal CD109 expression, concentration and activity, comprising administering a product that increases or reduces CD109 polypeptide expression, concentration or activity.

BRIEF DESCRIPTION OF THE DRAWINGS

Preferred embodiments of the invention will be described in relation to the drawings in which:

FIG. 1 a. shows [SEQ ID NO:1]. In a preferred embodiment, the sequence is a CD109 K1 cDNA.

FIG. 1 b. shows [SEQ ID NO:3]. In a preferred embodiment, the sequence is a variant of CD109 K1 cDNA.

FIG. 2 a. shows [SEQ ID NO:5]. In a preferred embodiment, the sequence is called K1-H7 cDNA.

FIG. 2 b. shows [SEQ ID NO:7]. In a preferred embodiment, the sequence is a variant of CD109 K1-H7 cDNA.

FIG. 3 a. shows [SEQ ID NO:2 and SEQ ID NO:6]. In a preferred embodiment, these sequences are the 1445 aa protein sequences produced from K1 and K1-H7 CD109 cDNAs respectively.

FIG. 3 b. shows [SEQ ID NO:4 and SEQ ID NO:8]. In a preferred embodiment, these sequences are variants of the 1445 aa protein sequences produced from K1 and K1-H7 variant CD109 cDNAs respectively.

FIG. 4 a. shows [SEQ ID NO:9]. In a preferred embodiment, the sequence is called the CD109 K15 cDNA.

FIG. 4 b. shows [SEQ ID NO:11]. In a preferred embodiment, this sequence is a variant of CD109 K15 cDNA.

FIG. 5 a. shows [SEQ ID NO:10]. In a preferred embodiment, the sequence is the K15 amino acid sequence.

FIG. 5 b. shows [SEQ ID NO:12]. In a preferred embodiment, this sequence is a variant of the K15 amino acid sequence.

FIG. 6. shows K1, K1-H7, and the corresponding amino acid sequence (K1 and K1-H7 polypeptide sequences are identical). Nucleotides are numbered relative to the translation initiation codon, with the corresponding aa numbering shown in parentheses. Both K1 and K1-H7 3′UTRs, and the positions of the corresponding poly(A) tails [(a)_(n)] are shown. Potential sites of N-linked glycosylation, the thioester signature sequence (aa residues 918-924), and the corresponding downstream thioester reactivity defining hexapeptide motif (aa residues 1039-1044) are marked by open boxes; solid underline, amino-terminal leader peptide; dotted underline, bait region; *, translation stop; open triangle, GPI anchor cleavage/addition site.

DETAILED DESCRIPTION OF THE INVENTION

“Nucleic acid molecule” includes DNA and RNA, whether single or double stranded. The term is also intended to include a strand that is a mixture of nucleic acid molecules and nucleic acid analogs and/or nucleotide analogs, or that is made entirely of nucleic acid analogs and/or nucleotide analogs.

“Nucleic acid analogue” refers to modified nucleic acids or species unrelated to nucleic acids that are capable of providing selective binding to nucleic acid molecules or other nucleic acid analogues. As used herein, the term “nucleotide analogues” includes nucleic acids where the internucleotide phosphodiester bond of DNA or RNA is modified to enhance bio-stability of the oligomer and “tune” the selectivity/specificity for target molecules (Uhlmann, et al., 1990, Angew. Chem. Int. Ed. Eng., 90: 543; Goodchild, 1990, J. Bioconjugate Chem., 1: 165; Englisch et al., 1991, Angew. Chem. Int. Ed. Eng., 30: 613). Such modifications may include and are not limited to phosphorothioates, phosphotriesters, phosphoramidates or methylphosphonates. The 2′-O-methyl, allyl and 2′-deoxy-2′-fluoro RNA analogs, when incorporated into an oligomer, show increased biostability and stabilization of the RNA/DNA duplex (Lesnik et al., 1993, Biochemistry, 32: 7832). As used herein, the term “nucleic acid analogues” also includes alpha anomers (α-DNA), L-DNA (mirror image DNA), 2′-5′ linked RNA, branched DNA/RNA or chimeras of natural DNA or RNA and the above-modified nucleic acids. For the purposes of the present invention, any nucleic acid molecule containing a “nucleotide analogue” shall be considered as a nucleic acid molecule. Backbone replaced nucleic acid analogues can also be adapted for use as immobilised selective moieties of the present invention. For purposes of the present invention, the peptide nucleic acids (PNAs) (Nielsen et al., 1993, Anti-Cancer Drug Design, 8: 53; Engels et al., 1992, Angew. Chem. Int. Ed. Eng., 31: 1008) and carbamate-bridged morpholino-type oligonucleotide analogs (Burger, D. R., 1993, J. Clinical Immunoassay, 16: 224; Uhlmann, et al., 1993, Methods in Molecular Biology, 20, “Protocols for Oligonucleotides and Analogs,” ed. Sudhir Agrawal, Humana Press, NJ, U.S.A., pp. 335-389) are also embraced by the term “nucleic acid analogues”. Both exhibit sequence-specific binding to DNA with the resulting duplexes being more thermally stable than the natural DNA/DNA duplex. Other backbone-replaced nucleic acids are well known to those skilled in the art and may also be used in the present invention (see e.g., Uhlmann et al., 1993, Methods in Molecular Biology, 20, “Protocols for Oligonucleotides and Analogs,” ed. Sudhir Agrawal, Humana Press, NJ, U.S.A., pp. 335).

Identification and Characterization of CD109

The invention includes isolated CD109 nucleic acid molecules and polypeptides. Three preferred sequences are K1 [SEQ ID NO:1 and 2], K1-H7 [SEQ ID NO:5 and 6] and K15 [SEQ ID NO:9 and 10] and the K1 [SEQ ID NO:3 and 4], K1-H7 [SEQ ID NO:7 and 8], and K15 [SEQ ID NO:11 and 12] variants. The CD109 is preferably mammalian, and more preferably human. No isolated CD109 nucleic acid molecules or polypeptides were known prior to this invention. The invention also includes a host cell transformed with a CD109 recombinant nucleic acid molecule and a recombinant isolated CD109 protein. The invention includes the recombinant nucleic acid molecules as well as the vectors including these molecules.

The invention includes CD109 nucleic acid molecules and molecules having sequence identity or which hybridize to the CD109 sequences shown in the figures (preferred percentages for sequence identity are described below). The invention also includes CD109 or proteins having sequence identity (preferred percentages described below) to the sequence shown in the figures. The nucleic acid molecules and proteins of the invention may be isolated from a native source, or they may be synthetic or recombinant. The nucleic acid molecules and polypeptides are optionally purified so that they are suitable for administration to humans.

Characterization of Nucleic Acid Molecules and Polypeptides

In one variation, the invention includes DNA sequences including at least one of the sequences shown in the figures in a nucleic acid molecule of preferably about: less than 1000 base pairs, less than 1250 base pairs, less than 1500 base pairs, less than 1750 base pairs, less than 2000 base pairs, less than 2250 base pairs, less than 2500 base pairs, less than 2750 base pairs or less than 3000 base pairs.

Regions of the CD109 nucleic acid molecule are as follows:

TABLE 1a
Clones K1, K1-H7
[brackets show corresponding amino acid nos.]

Nucleic Acid Molecule                                Start Nucleotide     End Nucleotide
Coding region only                                   1 (1)                4335 (1445)
thioester signature sequence (aa residues 918-924)   2752 (918)           2772 (924)
thioester reactivity defining hexapeptide motif      3088 (1030)          3105 (1035)
Bait region                                          about 1942 (648)     about 2052 (684)

TABLE 1b
Clone K15
[brackets show corresponding amino acid nos.]

Nucleic Acid Molecule                                Start Nucleotide     End Nucleotide
Coding region only                                   1 (1)                3201 (1067)
thioester signature sequence (aa residues 918-924)   2752 (918)           2772 (924)
thioester reactivity defining hexapeptide motif      3088 (1030)          3105 (1035)
Bait region*                                         about 1942 (648)     about 2052 (684)

*Note that the exact bait region coordinates are approximate.

It will be apparent that these may be varied, for example, by shortening the 5′ untranslated region or shortening the nucleic acid molecule so that the 3′ end nucleotide is in a different position.

The discussion of the nucleic acid molecules, sequence identity, hybridization and other aspects of nucleic acid molecules included within the scope of the invention is intended to be applicable to either the entire nucleic acid molecule in the figure or its coding region. One may use the entire molecule or only the coding region. Other possible modifications to the sequence are apparent.
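The bracketed amino acid numbers in Tables 1a and 1b are consistent with dividing each coding-sequence nucleotide position into codons of three, counting from the first nucleotide of the translation initiation codon. The following is a minimal illustrative sketch of that arithmetic, not part of the disclosed methods; the function name `codon_number` and the dictionary layout are assumptions introduced here for illustration.

```python
def codon_number(nt_position: int) -> int:
    """Return the 1-based codon (amino acid) number that contains a
    1-based coding-sequence nucleotide position."""
    if nt_position < 1:
        raise ValueError("nucleotide positions are 1-based")
    # Positions 1-3 fall in codon 1, positions 4-6 in codon 2, and so on.
    return (nt_position + 2) // 3


# (start nucleotide, end nucleotide) pairs copied from Tables 1a and 1b.
REGIONS = {
    "K1/K1-H7 coding region": (1, 4335),
    "K15 coding region": (1, 3201),
    "thioester signature sequence": (2752, 2772),
    "thioester reactivity defining hexapeptide motif": (3088, 3105),
    "bait region (approximate)": (1942, 2052),
}

for name, (start, end) in REGIONS.items():
    print(f"{name}: aa {codon_number(start)}-{codon_number(end)}")
```

Under these assumptions the computed codon numbers reproduce the bracketed values in the tables, for example nucleotides 2752 to 2772 map to amino acids 918 to 924.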

We have identified additional CD109 polymorphisms. Based on the numbering system of the CD109 patent, the 3 polymorphisms are as follows.

i. codon 792 of [SEQ ID NO:1] att to agt; Ile to Ser (isoleucine to serine)
ii. codon 797 of [SEQ ID NO:1] aat to agt; Asn to Ser (asparagine to serine)
iii. codon 845 of [SEQ ID NO:1] gtc to atc; Val to Ile (valine to isoleucine)

These polymorphisms optionally occur in each of the sequences of the invention. The polymorphisms would be expected to modify all of the cDNA and protein sequences (K1, K1 variant, K1-H7, K1-H7 variant, K15, K15 variant).
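The amino acid changes stated for the three polymorphisms can be checked against the standard genetic code. The sketch below is illustrative only: it builds the standard codon table and looks up the reference and variant codons listed above. The names `CODON_TABLE`, `POLYMORPHISMS` and `describe` are assumptions introduced here, and the code does not read any sequence from the sequence listing.

```python
from itertools import product

# Standard genetic code, with codons ordered t, c, a, g at each position.
BASES = "tcag"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# (codon number, reference codon, variant codon) as listed in the text.
POLYMORPHISMS = [
    (792, "att", "agt"),
    (797, "aat", "agt"),
    (845, "gtc", "atc"),
]


def describe(codon_no: int, ref: str, var: str) -> str:
    """Summarize a codon change and the amino acid change it encodes."""
    return (f"codon {codon_no}: {ref} to {var}; "
            f"{CODON_TABLE[ref]} to {CODON_TABLE[var]}")


for entry in POLYMORPHISMS:
    print(describe(*entry))
```

In one-letter code, the lookups give I to S, N to S and V to I, which agrees with the Ile to Ser, Asn to Ser and Val to Ile changes listed above.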

Functionally Equivalent Nucleic Acid Molecules

The term “isolated nucleic acid” refers to a nucleic acid the structure of which is not identical to that of any naturally occurring nucleic acid or to that of any fragment of a naturally occurring genomic nucleic acid spanning more than three separate genes. The term therefore covers, for example, (a) DNA which has the sequence of part of a naturally occurring genomic DNA molecule; (b) a nucleic acid incorporated into a vector or into the genomic DNA of a prokaryote or eukaryote, respectively, in a manner such that the resulting molecule is not identical to any naturally occurring vector or genomic DNA; (c) a separate molecule such as cDNA, a genomic fragment, a fragment produced by reverse transcription of polyA RNA which can be amplified by PCR, or a restriction fragment; and (d) a recombinant nucleotide sequence that is part of a hybrid gene, i.e., a gene encoding a fusion protein. Specifically excluded from this definition are nucleic acids present in mixtures of (i) DNA molecules, (ii) transfected cells, and (iii) cell clones, e.g., as these occur in a DNA library such as a cDNA or genomic DNA library.

Sequence Identity

This is the first isolation of a nucleic acid molecule encoding a CD109 polypeptide from a human. Nucleic acid sequences having sequence identity to the K1 [SEQ ID NO:1], K1-H7 [SEQ ID NO:5] or K15 [SEQ ID NO:9] sequence or their variants [SEQ ID NO:3, 7 and 11] are found in other mammals. The invention includes methods of isolating these nucleic acid molecules and polypeptides as well as methods of using these nucleic acid molecules and polypeptides according to the methods described in this application.

The invention includes the nucleic acid molecules from other species as well as methods of obtaining the nucleic acid molecules by, for example, screening a cDNA library or other DNA collection with a probe of the invention (such as a probe comprising at least about: 10 or preferably at least 15 or 30 nucleotides of the K1 [SEQ ID NO:1], K1-H7 [SEQ ID NO:5] or K15 [SEQ ID NO:9] sequence or their variants [SEQ ID NO:3, 7 and 11]) and detecting the presence of a CD109 nucleic acid molecule. Another method involves comparing the K1 [SEQ ID NO:1 to 4], K1-H7 [SEQ ID NO:5 to 8] or K15 [SEQ ID NO:9 to 12] sequence to other sequences, for example using bioinformatics techniques such as database searches or alignment strategies, and detecting the presence of a CD109 nucleic acid molecule or polypeptide. The invention includes the nucleic acid molecule and/or polypeptide obtained according to the methods of the invention. The invention also includes methods of using the nucleic acid molecules, for example to make probes, in research experiments or to transform host cells. These methods are as described below.

The polypeptides encoded by the CD109 nucleic acid molecules in other species will have amino acid sequence identity to the K1 [SEQ ID NO:2], K1-H7 [SEQ ID NO:6] or K15 [SEQ ID NO:10] sequence or their variants [SEQ ID NO:4, 8 or 12]. Sequence identity may be at least about: >20%, >25%, >28%, >30%, >35%, >40%, >50% to an amino acid sequence shown in the figures (or a partial sequence thereof). Some polypeptides may have a sequence identity of at least about: >60%, >70%, >80% or >90%, more preferably at least about: >95%, >99% or >99.5% to an amino acid sequence in the figures (or a partial sequence thereof). Identity is calculated according to methods known in the art. Sequence identity (nucleic acid and protein) is most preferably assessed by the algorithm of BLAST version 2.1 advanced search. BLAST is a series of programs that are available online from the National Center for Biotechnology Information (NCBI) of the U.S. National Institutes of Health. The advanced BLAST search is set to default parameters (i.e., Matrix BLOSUM62; Gap existence cost 11; Per residue gap cost 1; Lambda ratio 0.85 default).

References to BLAST Searches are:

-   Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. (1990) “Basic local alignment search tool.” J. Mol. Biol. 215:403-410.
-   Gish, W. & States, D. J. (1993) “Identification of protein coding regions by database similarity search.” Nature Genet. 3:266-272.
-   Madden, T. L., Tatusov, R. L. & Zhang, J. (1996) “Applications of network BLAST server” Meth. Enzymol. 266:131-141.
-   Altschul, S. F., Madden, T. L., Schäffer, A. A., Zhang, J., Zhang, Z., Miller, W. & Lipman, D. J. (1997) “Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.” Nucleic Acids Res. 25:3389-3402.
-   Zhang, J. & Madden, T. L. (1997) “PowerBLAST: A new network BLAST application for interactive or automated sequence analysis and annotation.” Genome Res. 7:649-656.

The invention also includes modified polypeptides which have sequence identity at least about: >20%, >25%, >28%, >30%, >35%, >40%, >50%, >60%, >70%, >80% or >90%, more preferably at least about >95%, >99% or >99.5%, to a CD109 sequence in the figures (or a partial sequence thereof). Modified polypeptide molecules are discussed below. Preferably about: 1, 2, 3, 4, 5, 6 to 10, 10 to 25, 26 to 50 or 51 to 100, or 101 to 250 nucleotides or amino acids are modified.

Nucleic Acid Molecules and Polypeptides Similar to K1, K1-H7 or K15

Those skilled in the art will recognize that the nucleic acid molecule sequences in the figures are not the only sequences which may be used to provide increased CD109 activity in cells. The genetic code is degenerate, so other nucleic acid molecules which encode a polypeptide identical to an amino acid sequence in the figures may also be used. The sequences of the other nucleic acid molecules of this invention may also be varied without changing the polypeptide encoded by the sequence. Consequently, the nucleic acid molecule constructs described below and in the accompanying examples for the preferred nucleic acid molecules, vectors, and transformants of the invention are merely illustrative and are not intended to limit the scope of the invention.
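The degeneracy of the genetic code can be illustrated with a short sketch in which two different nucleotide sequences translate to the same peptide. The two nine-nucleotide sequences below are hypothetical examples chosen for illustration and are not taken from any CD109 sequence; the helper name `translate` is likewise an assumption.

```python
from itertools import product

# Standard genetic code, with codons ordered t, c, a, g at each position.
BASES = "tcag"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence: str) -> str:
    """Translate a coding sequence (length a multiple of 3) to one-letter amino acids."""
    seq = coding_sequence.lower()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# Hypothetical sequences differing at two codons (aaa/aag and ctg/tta).
sequence_1 = "atgaaactg"
sequence_2 = "atgaagtta"

print(translate(sequence_1), translate(sequence_2))
print("same polypeptide:", translate(sequence_1) == translate(sequence_2))
```

Both hypothetical sequences encode Met-Lys-Leu, so a nucleic acid molecule that differs from a reference at synonymous codon positions still encodes the identical polypeptide.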

The sequences of the invention can be prepared according to numerous techniques. The invention is not limited to any particular preparation means. For example, the nucleic acid molecules of the invention can be produced by cDNA cloning, genomic cloning, cDNA synthesis, polymerase chain reaction (PCR), or a combination of these approaches (Current Protocols in Molecular Biology (F. M. Ausubel et al., 1989)). Sequences may be synthesized using well-known methods and equipment, such as automated synthesizers.

In one variation, the similar sequences are as shown in [SEQ ID NO:3, 7 and 11]. The coding regions correspond to those shown in Tables 1a and 1b.

Sequence Identity

The invention includes modified nucleic acid molecules with a sequence identity at least about: >17%, >20%, >30%, >40%, >50%, >60%, >70%, >80% or >90%, more preferably at least about >95%, >99% or >99.5%, to a DNA sequence in the figures (or a partial sequence thereof). Preferably about 1, 2, 3, 4, 5, 6 to 10, 10 to 25, 26 to 50 or 51 to 100, or 101 to 250 nucleotides or amino acids are modified. Identity is calculated according to methods known in the art. Sequence identity is most preferably assessed by the algorithm of the BLAST version 2.1 program advanced search (parameters as above). For example, if a nucleotide sequence (called “Sequence A”) has 90% identity to a portion of the nucleotide sequence in FIG. 1, then Sequence A will be identical to the referenced portion of the nucleotide sequence in FIG. 1, except that Sequence A may include up to 10 point mutations, such as substitutions with other nucleotides, per each 100 nucleotides of the referenced portion of the nucleotide sequence in FIG. 1. Nucleotide sequences functionally equivalent to the K1 [SEQ ID NO:1], K1-H7 [SEQ ID NO:5] or K15 [SEQ ID NO:9] sequence or their variants [SEQ ID NO:3, 7 and 11] can occur in a variety of forms as described below. Polypeptides having sequence identity may be similarly identified.
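The “Sequence A” example can be made concrete with a simple position-by-position comparison. The sketch below is a simplification under stated assumptions: it compares two ungapped sequences of equal length and does not reproduce the BLAST algorithm, which performs local alignment with gaps. The 100-nucleotide reference is a hypothetical placeholder and not a portion of any sequence in the figures.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions that match between two ungapped,
    equal-length sequences (a simplification; not BLAST)."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(seq_a.lower(), seq_b.lower()))
    return 100.0 * matches / len(seq_a)


# Hypothetical 100-nucleotide reference portion.
reference = "acgt" * 25

# Build "Sequence A" by substituting every tenth nucleotide,
# giving 10 point mutations per 100 nucleotides.
substitute = {"a": "c", "c": "g", "g": "t", "t": "a"}
sequence_a = "".join(
    substitute[nt] if i % 10 == 0 else nt
    for i, nt in enumerate(reference)
)

print(percent_identity(reference, sequence_a))
```

With 10 substitutions in 100 compared positions, the computed identity is 90%, matching the relationship described for Sequence A.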

The polypeptides encoded by the homologous CD109 nucleic acid molecule in other species will have amino acid sequence identity at least about: >20%, >25%, >28%, >30%, >40% or >50% to an amino acid sequence shown in the figures (or a partial sequence thereof). Some species may have polypeptides with a sequence identity of at least about: >60%, >70%, >80% or >90%, more preferably at least about: >95%, >99% or >99.5% to all or part of an amino acid sequence in the figures (or a partial sequence thereof). Identity is calculated according to methods known in the art. Sequence identity is most preferably assessed by the BLAST version 2.1 program advanced search (parameters as above). Preferably about: 1, 2, 3, 4, 5, 6 to 10, 10 to 25, 26 to 50 or 51 to 100, or 101 to 250 nucleotides or amino acids are modified.

The invention includes nucleic acid molecules with mutations that cause an amino acid change in a portion of the polypeptide not involved in providing CD109 activity or an amino acid change in a portion of the polypeptide involved in providing CD109 activity so that the mutation increases or decreases the activity of the polypeptide.

Hybridization

Other functionally equivalent forms of the nucleic acid molecules encoding CD109 can be isolated using conventional DNA-DNA or DNA-RNA hybridization techniques. These nucleic acid molecules and the CD109 sequences can be modified without significantly affecting their activity.

The present invention also includes nucleic acid molecules that hybridize to one or more of the sequences in the figures (or a partial sequence thereof) or their complementary sequences, and that encode peptides or polypeptides exhibiting substantially equivalent activity to that of a CD109 polypeptide produced by the DNA in the figures. Such nucleic acid molecules preferably hybridize to all or a portion of CD109 or its complement under low, moderate (intermediate), or high stringency conditions as defined herein (see Sambrook et al. (most recent edition) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.; Ausubel et al. (eds.), 1995, Current Protocols in Molecular Biology, (John Wiley & Sons, NY)). The hybridizing portion of the hybridizing nucleic acids is typically at least 15 (e.g. 20, 25, 30 or 50) nucleotides in length. The hybridizing portion of the hybridizing nucleic acid is at least 80%, e.g. at least 95% or at least 98%, identical to the sequence of a portion or all of a nucleic acid encoding a CD109 polypeptide, or its complement. Hybridizing nucleic acids of the type described herein can be used, for example, as a cloning probe, a primer (e.g. a PCR primer) or a diagnostic probe. Hybridization of the oligonucleotide probe to a nucleic acid sample typically is performed under stringent conditions. Nucleic acid duplex or hybrid stability is expressed as the melting temperature or Tm, which is the temperature at which a probe dissociates from a target DNA. This melting temperature is used to define the required stringency conditions. If sequences are to be identified that are related and substantially identical to the probe, rather than identical, then it is useful to first establish the lowest temperature at which only homologous hybridization occurs with a particular concentration of salt (e.g. SSC or SSPE). Then, assuming that 1% mismatching results in a 1 degree Celsius decrease in the Tm, the temperature of the final wash in the hybridization reaction is reduced accordingly (for example, if sequences having greater than 95% identity with the probe are sought, the final wash temperature is decreased by 5 degrees Celsius). In practice, the change in Tm can be between 0.5 degrees Celsius and 1.5 degrees Celsius per 1% mismatch. Low stringency conditions involve hybridizing at about: 1×SSC, 0.1% SDS at 50° C. High stringency conditions are: 0.1×SSC, 0.1% SDS at 65° C. Moderate stringency is about 1×SSC, 0.1% SDS at 60° C. The parameters of salt concentration and temperature can be varied to achieve the optimal level of identity between the probe and the target nucleic acid.
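The wash temperature adjustment described above can be illustrated with a short calculation. The following Python sketch is illustrative only; the function names and the 65° C. starting temperature are assumptions chosen for the example and are not taken from this application. It applies the stated rule of thumb (a 1 degree Celsius decrease in Tm per 1% mismatch) and also reports the range obtained when the decrease is between 0.5 and 1.5 degrees Celsius per 1% mismatch.

```python
def final_wash_temperature(homologous_temp_c, min_identity_pct,
                           drop_per_pct_mismatch=1.0):
    """Return a final wash temperature in degrees Celsius.

    homologous_temp_c: lowest temperature at which only homologous
        hybridization occurs at the chosen salt concentration
        (determined empirically; the value used below is hypothetical).
    min_identity_pct: minimum percent identity sought with the probe.
    drop_per_pct_mismatch: assumed decrease in Tm per 1% mismatch.
    """
    if not 0.0 <= min_identity_pct <= 100.0:
        raise ValueError("percent identity must be between 0 and 100")
    mismatch_pct = 100.0 - min_identity_pct
    return homologous_temp_c - mismatch_pct * drop_per_pct_mismatch


def wash_temperature_range(homologous_temp_c, min_identity_pct):
    """Return (lowest, highest) wash temperature for the 0.5 to 1.5
    degrees Celsius per 1% mismatch range mentioned in the text."""
    lowest = final_wash_temperature(homologous_temp_c, min_identity_pct, 1.5)
    highest = final_wash_temperature(homologous_temp_c, min_identity_pct, 0.5)
    return lowest, highest


if __name__ == "__main__":
    # Sequences with greater than 95% identity are sought: 5% mismatch.
    print(final_wash_temperature(65.0, 95.0))   # 60.0
    print(wash_temperature_range(65.0, 95.0))   # (57.5, 62.5)
```

With the 1 degree Celsius rule, a 95% identity threshold lowers the hypothetical 65° C. wash by 5 degrees Celsius, in agreement with the example given in the text.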

The present invention also includes nucleic acid molecules from any source, whether modified or not, that hybridize to genomic DNA, cDNA, or synthetic DNA molecules that encode the amino acid sequence of a CD109 polypeptide, or genetically degenerate forms, under salt and temperature conditions equivalent to those described in this application, and that code for a peptide or polypeptide that has CD109 activity. Preferably the polypeptide has the same or similar activity as that of a CD109 polypeptide. A nucleic acid molecule described above is considered to be functionally equivalent to a CD109 nucleic acid molecule (and thereby having CD109 activity) of the present invention if the polypeptide encoded by the nucleic acid molecule is recognized in a specific manner by a CD109-specific antibody, including, but not restricted to, the antibodies listed in this application.

The invention also includes nucleic acid molecules and polypeptides having sequence similarity taking into account conservative amino acid substitutions. Sequence similarity (and preferred percentages) is discussed below.

Modifications to Nucleic Acid Molecule or Polypeptide Sequence

Changes in the nucleotide sequence which result in production of a chemically equivalent or chemically similar amino acid sequence are included within the scope of the invention. Variants of the polypeptides of the invention may occur naturally, for example, by mutation, or may be made, for example, with polypeptide engineering techniques such as site directed mutagenesis, which are well known in the art for substitution of amino acids. For example, a hydrophobic residue such as glycine can be substituted for another hydrophobic residue such as alanine. An alanine residue may be substituted with a more hydrophobic residue such as leucine, valine or isoleucine. A negatively charged amino acid such as aspartic acid may be substituted for glutamic acid. A positively charged amino acid such as lysine may be substituted for another positively charged amino acid such as arginine.

Therefore, the invention includes polypeptides having conservative changes or substitutions in amino acid sequences. Conservative substitutions replace one or more amino acids with amino acids that have similar chemical properties. The invention includes sequences where conservative substitutions are made that do not destroy CD109 activity.
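By way of illustration only, the conservative substitutions named above can be expressed as a simple lookup. The following Python sketch is a simplified example and not a definition of the invention: the grouping covers only the residues named in the preceding paragraphs (glycine, alanine, leucine, valine, isoleucine; aspartic acid, glutamic acid; lysine, arginine), any other change is reported as non-conservative, and the short sequences in the usage example are hypothetical and are not CD109 sequences.

```python
# One-letter amino acid codes grouped as in the text above.
# Residues not listed here are simply not assigned to a group.
GROUPS = {
    "hydrophobic": set("GALVI"),
    "negatively charged": set("DE"),
    "positively charged": set("KR"),
}


def residue_group(residue):
    """Return the group name for a residue, or None if it is not listed."""
    for name, members in GROUPS.items():
        if residue in members:
            return name
    return None


def is_conservative(original, replacement):
    """True if both residues fall in the same listed group."""
    group = residue_group(original)
    return group is not None and group == residue_group(replacement)


def classify_substitutions(reference, variant):
    """Compare two equal-length sequences position by position.

    Returns a list of (position, reference residue, variant residue,
    conservative flag) for every position that differs (1-based).
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be the same length")
    changes = []
    for position, (ref, var) in enumerate(zip(reference, variant), start=1):
        if ref != var:
            changes.append((position, ref, var, is_conservative(ref, var)))
    return changes


if __name__ == "__main__":
    # Hypothetical four-residue sequences used only to exercise the code.
    print(classify_substitutions("GAKD", "AAKE"))
    # [(1, 'G', 'A', True), (4, 'D', 'E', True)]
```

Whether a given substitution preserves CD109 activity must still be determined by assay; the lookup only flags substitutions of the type described in the text.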

Polypeptides comprising one or more D-amino acids are contemplated within the invention. Also contemplated are polypeptides where one or more amino acids are acetylated at the N-terminus. Those of skill in the art recognize that a variety of techniques are available for constructing polypeptide mimetics with the same or similar desired CD109 activity as the corresponding polypeptide compound of the invention but with more favorable activity than the polypeptide with respect to solubility, stability, and/or susceptibility to hydrolysis and proteolysis. See, for example, Morgan and Gainor, Ann. Rep. Med. Chem., 24:243-252 (1989). Examples of polypeptide mimetics are described in U.S. Pat. No. 5,643,873. Other patents describing how to make and use mimetics include, for example, U.S. Pat. No. 5,786,322, U.S. Pat. No. 5,767,075, U.S. Pat. No. 5,763,571, U.S. Pat. No. 5,753,226, U.S. Pat. No. 5,683,983, U.S. Pat. No. 5,677,280, U.S. Pat. No. 5,672,584, U.S. Pat. No. 5,668,110 and U.S. Pat. No. 5,654,276. Mimetics of the polypeptides of the invention may also be made according to other techniques known in the art. For example, a polypeptide of the invention may be treated with an agent that chemically alters a side group by converting a hydrogen group to another group such as a hydroxy or amino group. Mimetics preferably include sequences that are either entirely made of amino acids or sequences that are hybrids including amino acids and modified amino acids or other organic molecules.

For example, one may modify the bait region to obtain variants with altered substrate specificity. One may also modify the hexapeptide region to alter thioester reactivity.

The invention also includes hybrid nucleic acid molecules and polypeptides, for example where a CD109 nucleotide sequence from one species is combined with a nucleotide sequence from a plant, mammal, bacterium or yeast to encode a fusion polypeptide. The invention includes a fusion protein having at least two components, wherein a first component of the fusion protein comprises a polypeptide of the invention, preferably a full length CD109 polypeptide (or a portion thereof, see below). The second component of the fusion protein preferably comprises a tag, for example GST, an epitope tag or an enzyme. The fusion protein may also comprise a histochemical or cytochemical marker such as lacZ, alkaline phosphatase, or horseradish peroxidase, or a fluorescent marker such as GFP or one of its derivatives.

The invention also includes polypeptide fragments of the polypeptides of the invention which may be used to confer CD109 activity if the fragments retain activity. The invention also includes polypeptide fragments of the polypeptides of the invention which may be used as a research tool to characterize the polypeptide or its activity. Such polypeptides preferably consist of at least 5 amino acids. In preferred embodiments, they may consist of 6 to 10, 11 to 15, 16 to 25, 26 to 50, 51 to 75, 76 to 100 or 101 to 250 amino acids of the polypeptides of the invention (or longer amino acid sequences). The fragments preferably have CD109 activity. Fragments may include sequences with one or more amino acids removed, for example, C-terminus amino acids in a CD109 sequence.

The invention also includes a composition comprising all or part of an isolated nucleic acid molecule (preferably K1 [SEQ ID NO:1], K1-H7 [SEQ ID NO:5] or K15 [SEQ ID NO:9] sequence or their variants [SEQ ID NO:3, 7 and 11]) of the invention with or without a carrier, preferably in a composition for cell transformation. The invention also includes a composition comprising an isolated CD109 polypeptide (preferably K1 [SEQ ID NO:2], K1-H7 [SEQ ID NO:6] or K15 [SEQ ID NO:10] or their variants [SEQ ID NO:4, 8 or 12]) with or without a carrier, preferably for studying or modulating polypeptide activity.

Recombinant Nucleic Acid Molecules

The invention also includes recombinant nucleic acid molecules comprising a nucleic acid molecule of the invention (preferably a K1 [SEQ ID NO:1], K1-H7 [SEQ ID NO:5] or K15 [SEQ ID NO:9] sequence or their variants [SEQ ID NO:3, 7, or 11] of the figures) and a promoter sequence, operatively linked so that the promoter enhances transcription of the nucleic acid molecule in a host cell (the nucleic acid molecules of the invention may be used in an isolated native gene or a chimeric gene, for example, where a nucleic acid molecule coding region is connected to one or more heterologous sequences to form a gene). The promoter sequence is preferably a constitutive promoter sequence or an inducible promoter sequence, operatively linked so that the promoter enhances transcription of the DNA molecule in a host cell. The promoter may be of a type not naturally associated with the cell, such as a super promoter, a chemical or drug inducible promoter, a steroid-inducible promoter or a tissue specific promoter. The CMV and SV40 promoters are commonly used to express a desired polypeptide in mammalian cells. Other promoters known in the art may also be used (many suitable promoters and vectors are described in the applications and patents referenced in this application). Tissue-specific promoters could include the vav or H2K promoters (all hematopoietic cells), PF4 or GP1b promoters (megakaryocytes and platelets), or the lck, CD2, or granzyme B promoters (T lymphocytes), and many others. Non-tissue specific promoters could include beta-actin, PGK, or CMV promoters, or retroviral LTRs, and many others. Inducible promoters could include the metallothionein IIA promoter, or ecdysone inducible or tetracycline inducible or repressible promoters, among many others.

A recombinant nucleic acid molecule for conferring CD109 activity may also contain suitable transcriptional or translational regulatory elements. Suitable regulatory elements may be derived from a variety of sources, and they may be readily selected by one with ordinary skill in the art (Sambrook, J., Fritsch, E. F. & Maniatis, T. (most recent edition). Molecular Cloning: A laboratory manual. Cold Spring Harbor Laboratory Press. New York; Ausubel et al. (Most Recent Edition) Current Protocols in Molecular Biology, John Wiley & Sons, Inc.). For example, if one wished to upregulate the expression of the nucleic acid molecule, one could insert a sense sequence and the appropriate promoter into the vector. If one wished to downregulate the expression of the nucleic acid molecule, one could insert the antisense sequence and the appropriate promoter into the vector. Examples of regulatory elements include: an enhancer or RNA polymerase binding sequence, a terminator region, and a ribosomal binding sequence, including a translation initiation signal. The regulatory elements described above may be from animal, plant, yeast, bacteria, fungus, virus or other sources, including synthetically produced elements and mutated elements. Additionally, depending on the vector employed, other genetic elements, such as selectable markers, may be incorporated into the recombinant molecule. Markers facilitate the selection of a transformed host cell. Such markers include genes associated with temperature sensitivity, drug resistance, or enzymes associated with phenotypic characteristics of the host organisms.

Methods of modifying DNA and polypeptides, preparing recombinant nucleic acid molecules and vectors, transformation of cells, and expression of polypeptides are known in the art. For guidance, one may consult the following U.S. Pat. Nos. 5,840,537, 5,850,025, 5,858,719, 5,710,018, 5,792,851, 5,851,788, 5,759,788, 5,840,530, 5,789,202, 5,871,983, 5,821,096, 5,876,991, 5,422,108, 5,612,191, 5,804,693, 5,847,258, 5,880,328, 5,767,369, 5,756,684, 5,750,652, 5,824,864, 5,763,211, 5,767,375, 5,750,848, 5,859,337, 5,563,246, 5,346,815, and WO9713843. Many of these patents also provide guidance with respect to experimental assays, probes and antibodies, methods, and transformation of host cells, which are described below. These patents, like all other patents and publications (such as articles and database publications) in this application, are incorporated by reference in their entirety.

Host Cells Including a CD109 Nucleic Acid Molecule

Levels of nucleic acid molecule expression may be controlled with nucleic acid molecules or nucleic acid molecule fragments that code for sense or anti-sense RNA. In a preferred embodiment of the invention, a cell (preferably a human cell) is transformed with a nucleic acid molecule of the invention or a fragment of a nucleic acid molecule inserted in a vector. The expression host may be any cell capable of expressing CD109, such as a cell selected from the group consisting of a mammalian cell, bacterium, yeast, fungus, protozoa or algae.

Another embodiment of the invention relates to the method of transforming a host cell with a nucleic acid molecule of the invention or a fragment of a nucleic acid molecule, inserted in a vector. The invention also includes the vector comprising a nucleic acid molecule of the invention. The nucleic acid molecules can be cloned into a variety of vectors by means that are well known in the art. The recombinant nucleic acid molecule may be inserted at a site in the vector created by restriction enzymes. A number of suitable vectors may be used, including cosmids, plasmids, bacteriophage, baculoviruses and viruses. Suitable vectors are capable of reproducing themselves and transforming a host cell. The invention also relates to a method of expressing polypeptides in the host cells.

Host cells may be cultured in conventional nutrient media. The media may be modified as appropriate for inducing promoters, amplifying genes or selecting transformants. The culture conditions, such as temperature, composition and pH, will be apparent to those skilled in the art. After transformation, transformants may be identified on the basis of a selectable phenotype. A selectable marker in the vector can confer a selectable phenotype.

Methods known in the art, including but not limited to electroporation, calcium phosphate or chloroquine transfection, viral infection, microinjection, and the use of cationic lipid and lipid/amino acid complexes, or of liposomes, or a large variety of other commercially available and readily synthesized transfection adjuvants, are useful to transfer a CD109 nucleic acid molecule into host cells. The invention also includes a method for constructing a host cell capable of expressing a nucleic acid molecule of the invention, the method comprising introducing into said host cell a vector of the invention. The genome of the host cell may or may not also include a functional CD109 gene. The invention also includes a method for expressing a CD109 polypeptide such as a K1 [SEQ ID NO:2], K1-H7 [SEQ ID NO:6] or K15 [SEQ ID NO:10] or their variants [SEQ ID NO:4, 8 or 12] in the host cell, the method comprising culturing the host cell in a culture medium under conditions suitable for gene expression so that the polypeptide is expressed. The process preferably further includes recovering the polypeptide from the cells or culture medium.

Antisense Technology for Inhibition of CD109

To reduce the abundance and thus the activity of the target protein, coding sequences typically obtained from cDNAs are expressed in the reverse orientation in transgenic cells so that the resultant RNA generated is complementary to the endogenous mRNA encoding the target protein. The binding of these two RNAs inhibits the translation of the target mRNA, thereby blocking or reducing the synthesis of the corresponding protein. Expression of the antisense RNA is usually accomplished using vectors that contain highly active promoter sequences, which synthesize an abundance of the antisense RNA. Patents that describe various uses and modifications of antisense technology include: U.S. Pat. Nos. 6,133,246, 6,096,722, 6,040,296, 5,801,159 and 5,739,119.

The nucleotide sequence encoding the antisense RNA molecule can theoretically be of any length, providing that the antisense RNA molecule transcribable therefrom is sufficiently long so as to be able to form a complex with a sense mRNA molecule encoding a CD109 polypeptide. The antisense RNA molecule complexes with the mRNA encoding the polypeptide and thereby reduces the half-life of the CD109 mRNA, and/or inhibits or reduces the synthesis of CD109. As a consequence of this interference by the antisense RNA, the activity of the CD109 polypeptides is decreased. The antisense RNA preferably comprises a sequence that is complementary to a portion of the coding sequence for CD109 shown in the figures, or a portion thereof, or preferably comprises a sequence having at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or 95% sequence identity to CD109 shown in the figures, or a portion thereof (sequence identity is determined as described above). The sequence may include the 5′-terminus, be downstream from the 5′-terminus, or may cover all or only a portion of the non-coding region, may bridge the non-coding and coding region, be complementary to all or part of the coding region, be complementary to the 3′-terminus of the coding region, or be complementary to the 3′-untranslated region of the mRNA. The particular site(s) to which the anti-sense sequence binds and the length of the anti-sense sequence will vary, for example, depending upon the degree of inhibition desired, the uniqueness of the sequence, and the stability of the anti-sense sequence.
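The complementarity relationship between an antisense RNA and its target mRNA can be illustrated computationally. The following Python sketch is a minimal example under stated assumptions: the function name is hypothetical, the short input sequence is an arbitrary illustration and is not a CD109 sequence from the figures, and only standard Watson-Crick pairing is considered. It returns the antisense RNA, written 5′ to 3′, that is fully complementary to a given sense sequence.

```python
# Watson-Crick pairing for RNA bases.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(sense_sequence):
    """Return the antisense RNA (5' to 3') for a sense mRNA or cDNA.

    The input may be given as RNA (A, C, G, U) or as the corresponding
    coding-strand DNA (A, C, G, T); T is read as U. The antisense strand
    is the reverse complement of the sense strand.
    """
    sense = sense_sequence.upper().replace("T", "U")
    unexpected = set(sense) - set("ACGU")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return sense.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Arbitrary six-base example, not taken from the CD109 sequence.
    print(antisense_rna("AUGGCC"))  # GGCCAU
```

Applying the function twice returns the original sense sequence, which is a convenient check that an antisense insert has been designed against the intended region.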

The sequence may be a single sequence or a repetitive sequence having two or more repetitive sequences in tandem, where the single sequence may bind to a plurality of messenger RNAs. The antisense sequence may be complementary to a unique sequence or a repeated sequence, so as to enhance the probability of binding. The antisense sequence may be involved with the binding of a unique sequence, a single unit of a repetitive sequence or of a plurality of units of a repetitive sequence. In some instances, rather than providing for homoduplexing, heteroduplexing may be employed, where the same sequence may provide for inhibition of a plurality of messenger RNAs by having regions complementary to different messenger RNAs. The antisense sequence may also contain additional nucleotide sequence unrelated to the sequence encoding CD109. This unrelated sequence may be flanked on one or both sides by CD109-related sequence, or may be located at one or both ends of the CD109-related sequence. The transcriptional construct will preferably include, in the direction of transcription, a transcriptional initiation region, the sequence coding for the antisense RNA on the sense strand, and a transcriptional termination region.

The DNA encoding the antisense RNA can vary in length from less than 20 nucleotides in length, up to about the length of the corresponding mRNA produced by the cell. For example, the length of the DNA encoding the antisense RNA can be from less than 20 to 1500, 2000, 3000, or more, nucleotides in length. The anti-sense sequence complementary to a portion of the sequence of the messenger RNA may be less than 20 nucleotides in length, but will usually be at least about 20, 30, 50, 75 or 100 nucleotides or more, and is often fewer than about 1000 nucleotides in length. The preferred source of antisense RNA for DNA constructs of the present invention is DNA that is complementary to a full length CD109, or fragments thereof. DNA showing substantial sequence identity to the complement of CD109 or fragments thereof is also useful, and is encompassed by this invention.

Suitable promoters are described elsewhere in this application and known in the art. The promoter gives rise to the transcription of a sufficient amount of the antisense RNA molecule at a rate sufficient to cause a reduction of CD109 protein in cells. The required amount of antisense RNA to be transcribed may vary from cell to cell. Other regulatory elements described in this application, such as enhancers and terminators, may also be used. The invention also includes a vector, such as a plasmid or virus, encompassing the antisense DNA.

The invention includes the cells (for example, the cells of the species listed above) containing the antisense sequence. The invention further provides tissues comprising such cells and the progeny of such cells, which contain the DNA sequence stably incorporated and hereditable.

The invention also includes the use of a sequence according to the invention in the production of cells having a modified CD109 content. By “modified CD109 content” is meant that a cell exhibits non-wild-type levels of CD109 due to inhibited or reduced expression of CD109.

The invention still further provides a method of inhibiting or reducing expression of a CD109 polypeptide in cells, comprising introducing into such cells a nucleic acid molecule according to the invention, such as CD109 antisense DNA, or a vector containing such DNA. In one example, the invention includes a method for reducing expression of a nucleic acid molecule encoding a CD109 polypeptide, such as CD109, comprising: a) integrating into the genome of a cell, or expressing transiently within the cell without integration, a nucleic acid molecule complementary to all or part of endogenous CD109 mRNA; and b) growing the transformed cell, so that the complementary nucleic acid molecule is transcribed and binds to the CD109 mRNA, thereby reducing expression of the nucleic acid molecule encoding the CD109 polypeptide, and thereby resulting in reduced CD109 synthesis. Typically, the amount of RNA transcribed from the complementary strand is less than the amount of the mRNA endogenous to the cell.

The antisense DNA may also comprise a nucleic acid molecule encoding a marker polypeptide, the marker polypeptide also operably linked to a promoter.

Fragments/Probes

Preferred fragments include 10 to 50, 50 to 100, 100 to 250, 250 to 500, 500 to 1000, 1000 to 1500, or 1500 or more nucleotides of a nucleic acid molecule of the invention. A fragment may be generated by removing one or more nucleotides from a sequence in the figures (or a partial sequence thereof). Fragments may or may not encode a polypeptide having CD109 activity.

The nucleic acid molecules of the invention (including a fragment of a sequence in the figures (or a partial sequence thereof)) can be used as probes to detect nucleic acid molecules according to techniques known in the art (for example, see U.S. Pat. Nos. 5,792,851 and 5,851,788; E. S. Kawasaki (1990), In Innis et al., Eds., PCR Protocols, Academic Press, San Diego, Chapter 3 re PCR and reverse transcriptase). The probes may be used to detect nucleic acid molecules that encode polypeptides similar to the polypeptides of the invention. For example, a probe having at least about 10 bases will hybridize to similar sequences under stringent hybridization conditions (Sambrook et al. 1989, Molecular Cloning, A Laboratory Manual, Cold Spring Harbor). Polypeptide fragments of K1 [SEQ ID NO:2], K1-H7 [SEQ ID NO:6] or K15 [SEQ ID NO:10] or their variants [SEQ ID NO:4, 8 or 12] are preferably at least 8 amino acids in length and are useful, for example, as immunogens for raising antibodies that will bind to intact protein (immunogenic fragments). Typically, synthetic peptides are 8 to 16 amino acids in length, 8 being the minimum; 12 amino acids is commonly used. Cloning and expression of the DNA are used to assess the activity of the polypeptide encoded by the nucleic acid molecule. After the expression product is isolated, the polypeptide is assayed for activity as described in this application.
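As an illustration of the peptide lengths discussed above, candidate synthetic peptides can be enumerated from a polypeptide sequence with a sliding window. The following Python sketch is an example only: the function name, the default step and the test sequence are assumptions made for illustration (the test sequence is arbitrary and is not a CD109 sequence), and the sketch does not attempt to predict which fragments are immunogenic.

```python
def peptide_fragments(sequence, length=12, step=1):
    """List candidate peptides of a fixed length from a polypeptide.

    sequence: one-letter amino acid sequence.
    length: peptide length; restricted here to the 8 to 16 residue
        range mentioned in the text, with 12 as the default.
    step: offset between the starts of consecutive fragments.

    Returns a list of (start position, peptide) with 1-based positions.
    """
    if not 8 <= length <= 16:
        raise ValueError("length must be between 8 and 16 residues")
    if step < 1:
        raise ValueError("step must be at least 1")
    last_start = len(sequence) - length
    return [
        (start + 1, sequence[start:start + length])
        for start in range(0, last_start + 1, step)
    ]


if __name__ == "__main__":
    # Arbitrary 14-residue sequence used only to exercise the function.
    for position, peptide in peptide_fragments("ACDEFGHIKLMNPQ"):
        print(position, peptide)
    # 1 ACDEFGHIKLMN
    # 2 CDEFGHIKLMNP
    # 3 DEFGHIKLMNPQ
```

A sequence shorter than the requested length yields an empty list, so callers can distinguish "no fragment possible" from an invalid length.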

Enhancement of CD109 Polypeptide Activity

The activity of the CD109 polypeptide is increased or decreased by carrying out selective site-directed mutagenesis. Using protein modeling and other prediction methods, we characterize the binding domain and other critical amino acid residues in the polypeptide that are candidates for mutation, insertion and/or deletion. A DNA plasmid or expression vector containing the CD109 nucleic acid molecule or a nucleic acid molecule having sequence identity is preferably used for these studies using the U.S.E. (Unique site elimination) mutagenesis kit from Pharmacia Biotech or other mutagenesis kits that are commercially available, or using PCR. Once the mutation is created and confirmed by DNA sequence analysis, the mutant polypeptide is expressed using an expression system and its activity is monitored. This approach is useful not only to enhance activity, but also to engineer some functional domains for other properties useful in the purification or application of the polypeptides or the addition of other biological functions. It is also possible to synthesize a DNA fragment based on the sequence of the polypeptides that encodes smaller polypeptides that retain activity and are easier to express. It is also possible to modify the expression of the cDNA so that it is induced under desired environmental conditions or in response to different chemical inducers or hormones. It is also possible to modify the DNA sequence so that the polypeptide is targeted to a different location. All these modifications of the DNA sequences presented in this application and the polypeptides produced by the modified sequences are encompassed by the present invention.

Pharmaceutical Compositions

The CD109 nucleic acid molecule or its polypeptide and functionally equivalent nucleic acid molecules or polypeptides are useful when used alone, but are also useful when combined with other components such as a carrier in a pharmaceutical composition.

CD109 is expressed on hematopoietic stem and progenitor cells, endothelial cells, and activated platelets and T cells, and is capable of covalent substrate binding and protease inhibition. CD109 is useful as a protease inhibitor. All or part of CD109 may be administered to a subject, such as a human, in soluble form as a therapeutic modulator of endothelial or of platelet function, of the blood coagulation (as an anticoagulant) or fibrinolytic systems, or of immune function, including T cell effector function, antigen presentation, and complement activation. Specifically, pharmaceutical compositions of this invention are used to treat patients having degenerative diseases, disorders or abnormal physical states, including but not limited to conditions associated with endothelial activation, platelet activation, activation of the coagulation or fibrinolytic systems, and activation of T lymphocytes, the complement system, and the immune system. Specifically, such disorders may include (but are not limited to) cardiovascular disorders including stroke, myocardial infarction, thrombosis, and embolism, and peripheral vascular disease; disorders associated with quantitative or qualitative abnormalities of platelet function, including thrombocytopenia, thrombocythemia, and conditions associated with increased or impaired platelet aggregation and activation; conditions associated with increased or impaired activation of the coagulation and/or fibrinolytic systems; and conditions associated with impaired or increased immune activation, including autoimmune diseases as well as organ and bone marrow transplantation. CD109 peptide and protein reagents may be used not only for the treatment of such conditions, but also for their diagnosis and prevention.

The CD109 compositions are useful when administered in methods of medical treatment, prevention, or diagnosis of a disease, disorder or abnormal physical state characterized by insufficient CD109 expression or inadequate levels or activity of CD109 polypeptide by increasing expression, concentration or activity. The invention also includes methods of medical treatment, prevention, or diagnosis of a disease, disorder or abnormal physical state characterized by excessive CD109 expression or levels or activity of CD109 polypeptide, for example by administering a pharmaceutical composition, possibly including a carrier and a vector, that expresses CD109 antisense DNA, or a vector that expresses an inactive mutant or variant form of CD109, or by the administration of CD109 specific antibodies that interfere with CD109 activity, or lead to decreased CD109 levels. The invention also includes measurement of cell associated or soluble CD109 levels for diagnostic follow up and risk assessment or prognostic information purposes. For example, one may monitor progression of autoimmune disease or responses to organ transplantation. The invention also includes methods of medical treatment, prevention, or diagnosis of a disease, disorder or abnormal physical state characterized by neither increased nor reduced CD109 expression or levels or activity of CD109 polypeptide, but in which the modulation of CD109 levels or activity may be of therapeutic, preventive, or diagnostic value. An agent that upregulates CD109 gene expression or CD109 polypeptide activity may be combined with a carrier to form a pharmaceutical composition. An agent that downregulates CD109 expression or CD109 polypeptide activity may be combined with a carrier to form a pharmaceutical composition.

The pharmaceutical compositions can be administered to humans or animals by a variety of methods including, but not restricted to, topical administration, oral administration, aerosol administration, intratracheal instillation, intraperitoneal injection, injection into the cerebrospinal fluid, and intravenous injection in methods of medical treatment involving upregulating or downregulating CD109 gene or polypeptide levels or activity. Dosages to be administered depend on patient needs, on the desired effect and on the chosen route of administration. Nucleic acid molecules and polypeptides may be introduced into cells using in vivo delivery vehicles such as liposomes. They may also be introduced into these cells using physical techniques such as microinjection and electroporation or chemical methods such as coprecipitation or using liposomes.

The pharmaceutical compositions can be prepared by known methods for the preparation of pharmaceutically acceptable compositions which can be administered to patients, and such that an effective quantity of the nucleic acid molecule or polypeptide is combined in a mixture with a pharmaceutically acceptable vehicle. Suitable vehicles are described, for example, in Remington's Pharmaceutical Sciences (Mack Publishing Company, Easton, Pa., USA).

On this basis, the pharmaceutical compositions could include an active compound or substance, such as a CD109 nucleic acid molecule or polypeptide, in association with one or more pharmaceutically acceptable vehicles or diluents, and contained in buffered solutions with a suitable pH and isoosmotic with the physiological fluids. The methods of combining the active molecules with the vehicles or combining them with diluents are well known to those skilled in the art. The composition could include a targeting agent for the transport of the active compound to specified sites within tissue.

Administration of CD109 Nucleic Acid Molecule by Gene Therapy

Since persons suffering from disease, disorder or abnormal physical state can be treated by either up or down regulation of CD109, gene therapy to increase or reduce CD109 expression is useful to modify the development/progression of disease. For example, to treat a hypercoagulable state, gene therapy (for example, targeting CD109 expression to endothelial or blood cells) could be used to enhance CD109 anticoagulant activity, thereby decreasing the propensity for blood clot formation.

The invention also includes methods and compositions for providing gene therapy for treatment of diseases, disorders or abnormal physical states characterized by insufficient CD109 expression or inadequate levels or activity of CD109 polypeptide (see the discussion of pharmaceutical compositions, above) involving administration of a pharmaceutical composition of the invention. The invention also includes methods and compositions for providing gene therapy for treatment of diseases, disorders or abnormal physical states characterized by excessive CD109 expression or levels of activity of CD109 polypeptide involving administration of a pharmaceutical composition.

The invention includes methods and compositions for providing a nucleic acid molecule encoding CD109 or a functionally equivalent nucleic acid molecule to the cells of an individual such that expression of CD109 in the cells provides the biological activity or phenotype of CD109 polypeptide to those cells. Sufficient amounts of the nucleic acid molecule are administered and expressed at sufficient levels to provide the biological activity or phenotype of CD109 polypeptide to the cells. For example, the method can preferably involve a method of delivering a nucleic acid molecule encoding CD109 to the cells of an individual having a disease, disorder or abnormal physical state, comprising administering to the individual a vector comprising DNA encoding CD109. The method may also relate to a method for providing an individual having a disease, disorder or abnormal physical state with biologically active CD109 polypeptide by administering DNA encoding CD109. The method may be performed ex vivo or in vivo. Methods and compositions for administering CD109 (including in gene therapy) are explained, for example, in U.S. Pat. Nos. 5,672,344, 5,645,829, 5,741,486, 5,656,465, 5,547,932, 5,529,774, 5,436,146, 5,399,346, 5,670,488 and 5,240,846, which are incorporated by reference in their entirety.

The method also relates to a method for producing a stock of recombinant virus by producing virus suitable for gene therapy comprising DNA encoding CD109. This method preferably involves transfecting cells permissive for virus replication (the virus containing the nucleic acid molecule) and collecting the virus produced.

The invention also includes methods and compositions for providing a nucleic acid molecule encoding an antisense sequence to CD109 such that expression of the sequence prevents or reduces CD109 biological activity or phenotype. The methods and compositions can be used in vivo or in vitro. Sufficient amounts of the nucleic acid molecule are administered and expressed at sufficient levels to reduce the biological activity or phenotype of CD109 polypeptide in the cells. Methods similar to those described in the preceding paragraphs may be used with appropriate modifications.

The methods and compositions can be used in vivo or in vitro. The invention also includes compositions (preferably pharmaceutical compositions for gene therapy). The compositions include a vector containing CD109 and a carrier. The carrier may be a pharmaceutical carrier or a host cell transformant including the vector. Vectors known in the art include but are not restricted to retroviruses, adenoviruses, adeno-associated virus (AAV), herpes virus vectors, vaccinia virus vectors, HIV and lentivirus-based vectors, and plasmids. The invention also includes packaging and helper cell lines that are required to produce the vector. Methods of producing the vector and methods of gene therapy using the vector are also included with the invention.

The invention also includes a transformed cell, such as a blood cell, containing the vector and the recombinant CD109 nucleic acid molecule sense or antisense sequences.

Heterologous Expression of CD109

Expression vectors are useful to provide high levels of polypeptide expression. Cell cultures transformed with the nucleic acid molecules of the invention are useful as research tools, particularly for studies of CD109. Cell cultures are used in overexpression and research according to numerous techniques known in the art. For example, a cell line (either an immortalized cell culture or a primary cell culture) may be transfected with a vector containing a CD109 nucleic acid molecule (or molecule having sequence identity) to measure levels of expression of the nucleic acid molecule and the activity of the nucleic acid molecule and polypeptide. A polypeptide of the invention may be used in an assay to identify compounds that bind the polypeptide (assays may be adapted, for example, from U.S. Pat. No. 5,851,788). Methods known in the art may be used to identify agonists and antagonists of the polypeptides. One may obtain cells that do not express CD109 endogenously and use them in experiments to assess CD109 nucleic acid molecule expression. Experimental groups of cells may be transfected with vectors containing different types of CD109 nucleic acid molecules (or nucleic acid molecules having sequence identity to CD109 or fragments of CD109 nucleic acid molecule) to assess the levels of polypeptide produced, its functionality and the phenotype of the cells produced. Other expression systems can also be utilized to overexpress CD109 in recombinant systems. The polypeptides are also useful for in vitro analysis of CD109 activity. For example, the polypeptide produced can be used for microscopy or X-ray crystallography studies, and the tertiary structure of individual domains may be analyzed by NMR spectroscopy.

Experiments may be performed with cell cultures or in vivo to identify polypeptides that bind to different domains of CD109. One could also target thioester domains, such as the thioester signature sequence or the hexapeptide motif. For example, thioester activity could be blocked to study the effects on CD109. Another example is blocking the membrane domain to prevent membrane localization of CD109. Similar approaches could be taken to study other polypeptide domains or motifs.

Preparation of Antibodies

The invention includes an isolated antibody immunoreactive with a polypeptide of the invention. Antibodies are preferably generated against epitopes of native K1 [SEQ ID NO:2], K1-H7 [SEQ ID NO:6] or K15 [SEQ ID NO:10] or their variants [SEQ ID NO:4, 8 or 12] or synthetic peptides of K1 [SEQ ID NO:2], K1-H7 [SEQ ID NO:6] or K15 [SEQ ID NO:10] or their variants [SEQ ID NO:4, 8, or 12]. The antibody may be labeled with a detectable marker or unlabeled. The antibody is preferably a monoclonal antibody or a polyclonal antibody. CD109 antibodies can be employed to screen organisms containing CD109 polypeptides. The antibodies are also valuable for immuno-purification of polypeptides from crude extracts. For example, one may contact a biological sample with the antibody under conditions allowing the formation of an immunological complex between the antibody and a polypeptide recognized by the antibody, and detect the presence or absence of the immunological complex, whereby the presence of CD109 or a similar polypeptide is detected in the sample. The invention also includes compositions preferably including the antibody, a medium suitable for the formation of an immunological complex between the antibody and a polypeptide recognized by the antibody and a reagent capable of detecting the immunological complex to ascertain the presence of CD109 or a similar polypeptide.

To recognize CD109, one may generate antibodies against a range of unique epitopes throughout the molecule. To block activity of CD109, one could generate antibodies that target the thioester signature sequence, or the thioester reactivity defining hexapeptide motif, to block thioester activity, or the bait region, to block initial proteolytic activation of CD109. In addition, these antibodies, or other antibodies directed against other CD109 epitopes, could block CD109 activity by leading to enhanced CD109 clearance/degradation, with a concomitant decrease in CD109 activity.

Monoclonal and polyclonal antibodies are prepared according to the description in this application and techniques known in the art. For examples of methods of the preparation and uses of monoclonal antibodies, see U.S. Pat. Nos. 5,688,681, 5,688,657, 5,683,693, 5,667,781, 5,665,356, 5,591,628, 5,510,241, 5,503,987, 5,501,988, 5,500,345 and 5,496,705, which are incorporated by reference in their entirety. Examples of the preparation and uses of polyclonal antibodies are disclosed in U.S. Pat. Nos. 5,512,282, 4,828,985, 5,225,331 and 5,124,147, which are incorporated by reference in their entirety.

The invention also includes methods of using the antibodies. For example, the invention includes a method for detecting the presence of a CD109 polypeptide such as K1 [SEQ ID NO:2], K1-H7 [SEQ ID NO:6] or K15 [SEQ ID NO:10] or their variants [SEQ ID NO:4, 8 or 12], by: a) contacting a sample containing one or more polypeptides with an antibody of the invention under conditions suitable for the binding of the antibody to polypeptides with which it is specifically reactive; b) separating unbound polypeptides from the antibody; and c) detecting antibody which remains bound to one or more of the polypeptides in the sample.

Identification of Epitopes and Purification of Hematopoietic Stem Cells

The sequences of the invention are used to map the epitopes recognized by anti-CD109 antibodies (see Table 3), including but not restricted to 8A3, 23/5F6, 39/6C3, TEA2/16, D2, LDA1, 7D1, 40B8, 1B3, 59D6, 8A1, 7C5, and B-E47. Identification of these epitopes allows the production of competing antibody-specific peptides that are useful for eluting CD109 selected cells from anti-CD109-based immuno-affinity matrices.

To date, existing CD109 antibodies are known to recognize at least five distinct epitopes that have been mapped as follows:

The simplest method demonstrates whether various antibodies recognize the same or distinct epitopes. This method determines whether the available unlabelled monoclonal antibodies are able to block the binding of labelled antibodies. For example, an unlabelled antibody that recognizes the same epitope as a labelled antibody will block binding of the labelled antibody to the surface of a cell expressing the antigen of interest. Thus, decreased label will be found to bind to the cell. In contrast, an unlabelled antibody that does not bind to the same epitope will not inhibit the binding of the labelled antibody. A related method involves assessing whether the binding of distinct labelled antibodies to the target cell is additive. If the binding of two antibodies together (as determined by measuring cell-associated label) is greater than that of either antibody used individually, the two antibodies likely recognize distinct, non-overlapping epitopes. If the binding of two antibodies together (as determined by measuring cell-associated label) is greater than that of either antibody used individually, but is less than would be expected by adding the individual binding of the two antibodies together, then the two antibodies likely recognize distinct but closely apposed epitopes.
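The additivity reasoning described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name, the 10% tolerance, and the example signal values are assumptions introduced here and are not taken from the experiments reported in this application.

```python
def classify_epitope_relationship(signal_a, signal_b, signal_both, tolerance=0.10):
    """Interpret cell-associated label for two labelled antibodies.

    signal_a, signal_b: label bound when each antibody is used alone.
    signal_both: label bound when both antibodies are used together.
    tolerance: hypothetical allowance for measurement noise (assumption).
    """
    if signal_a <= 0 or signal_b <= 0:
        raise ValueError("individual binding signals must be positive")
    larger = max(signal_a, signal_b)
    additive = signal_a + signal_b
    # No gain over the better single antibody: the two compete for one site.
    if signal_both <= larger * (1 + tolerance):
        return "same or overlapping epitope"
    # Combined binding close to the arithmetic sum: fully independent sites.
    if signal_both >= additive * (1 - tolerance):
        return "distinct, non-overlapping epitopes"
    # Greater than either alone but short of the sum: partial hindrance.
    return "distinct but closely apposed epitopes"


if __name__ == "__main__":
    # Made-up fluorescence values, for illustration only.
    print(classify_epitope_relationship(100, 80, 105))
    print(classify_epitope_relationship(100, 80, 175))
    print(classify_epitope_relationship(100, 80, 140))
```

The tolerance exists only because real measurements are noisy; an appropriate value would have to be chosen from replicate data.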

Using such techniques with biotinylated- or phycoerythrin-conjugated antibodies, we have performed epitope-mapping studies on several of the assigned CD109 antibodies.

We have determined that there exist at least five discrete CD109 epitopes recognized by different CD109 antibodies:

TABLE 2

Epitope   I     II         III*   IV     V
Clone     7D1   8A3, 7C5   8A1    40B8   1B3

Additionally, clones LDA1, Tea2/16, and 59D6 recognize epitopes distinct from epitope II.
*Epitope III is close to epitope II.

The invention includes methods of identifying such antibody-specific CD109 peptides for competitively binding to a CD109 antibody in the presence of CD109+ cells, including: (a) providing peptide fragments of [SEQ ID NO:2, 4, 6 or 8] (for example 5-10, 10-15 or 15-20 amino acid sequences); and (b) determining whether the fragments competitively bind to the CD109 antibody. Several approaches to defining such peptides are known in the art. Methods for identifying such peptides include, but are not restricted to, the following:

In one approach, polypeptides corresponding to overlapping CD109 cDNA subfragments are evaluated for antibody binding. Antibody specific peptides are identified by the sequential generation and expression of progressively smaller cDNA fragments, and ultimately by the synthesis of specific peptides. The ability of minimal binding peptides to block antibody binding to recombinant and native CD109 and CD109 expressing cells, and the ability of such peptides to elute CD109 and CD109 expressing cells from anti-CD109-based immuno-affinity matrices, is then confirmed.

In an alternate approach, CD109 antibody specific peptides are identified by a phage-display, bacterial display or analogous CD109 cDNA expression library selection method, with the specificity of the resultant peptides evaluated as above. A phage display library is made from the cDNA encoding the molecule of choice such that each phage in the library expresses a cDNA fragment encoding a small peptide (6 amino acids, for example) portion of the molecule of interest. Overall, the entire library comprises a series of overlapping cDNA fragments such that the corresponding overlapping peptides represent the entire molecule. The phage library is then panned on a series of petri dishes, each pre-coated with one of the antibody clones. Phage expressing the appropriate hexapeptide recognised by the particular antibody clone become bound to the dish, while the rest can be washed away. After a series of similar panning/recloning steps, DNA from the cloned phage is sequenced to determine which fragment of the original cDNA it contains, and which amino acid sequence it encodes. By comparison with the overall cDNA, the precise location of the epitope can be mapped. In a related method, the phage library does not correspond directly to the cDNA of interest, but rather is composed of random cDNA fragments of defined size that encode random peptides. Screening and analysis are as above. Synthetic peptides are also screened directly by a similar approach. More recently, the ability to map antibody-specific epitopes by mass spectrometry and with the use of biosensors has also been described.
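The final mapping step of the overlapping-peptide approach described above (locating antibody-bound peptides within the parent sequence) can be illustrated with a short Python sketch. The function names and the 16-residue toy sequence are hypothetical and chosen only for illustration; the toy sequence is not a CD109 sequence.

```python
def overlapping_peptides(protein, length=6, step=1):
    """Return (1-based start, peptide) pairs tiling the whole sequence."""
    return [(i + 1, protein[i:i + length])
            for i in range(0, len(protein) - length + 1, step)]


def map_epitope(protein, bound_peptides):
    """Merge the positions of antibody-bound peptides into 1-based spans."""
    spans = []
    for peptide in bound_peptides:
        start = protein.find(peptide)
        while start != -1:
            spans.append((start + 1, start + len(peptide)))
            start = protein.find(peptide, start + 1)
    spans.sort()
    merged = []
    for begin, end in spans:
        # Overlapping or directly adjacent hits are treated as one region.
        if merged and begin <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((begin, end))
    return merged


if __name__ == "__main__":
    toy_sequence = "MKTLLVAGHSWQERTY"  # hypothetical sequence, not CD109
    library = overlapping_peptides(toy_sequence)
    print(len(library), "hexapeptides in the library")
    # Suppose sequencing of the retained phage recovered these two inserts.
    print(map_epitope(toy_sequence, ["LVAGHS", "VAGHSW"]))
```

A linear scan of this kind can only locate linear (continuous) epitopes; discontinuous epitopes would not be recovered by exact substring matching.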

References to Epitope Mapping:

-   Westerlund-Wikstrom B, et al. Peptide display on bacterial flagella: principles and applications. Int J Med Microbiol. July 2000; 290(3):223-30.
-   Reineke U, et al. Antigen sequence- and library-based mapping of linear and discontinuous protein-protein-interaction sites by spot synthesis. Curr Top Microbiol Immunol. 1999; 243:23-36.
-   DeLisser H M. Epitope mapping. Methods Mol Biol. 1999; 96:11-20.
-   Van Regenmortel M H, et al. Measurement of antigen-antibody interactions with biosensors. J Mol Recognit. 1998 Winter; 11(1-6):163-7.
-   Felici F, et al. Peptide and protein display on the surface of filamentous bacteriophage. Biotechnol Annu Rev. 1995; 1:149-83.
-   Steen R, et al. CD34 molecule epitope distribution on cells of haematopoietic origin. Leuk Lymphoma. June 1998; 30(1-2):23-30.
-   Van de Water J, et al. Detection of molecular determinants and epitope mapping using MALDI-TOF mass spectrometry. Clin Immunol Immunopathol. December 1997; 85(3):229-35.
-   Burton D R. Phage display. Immunotechnology. August 1995; 1(2):87-94.
-   Gershoni J M, et al. Combinatorial libraries, epitope structure and the prediction of protein conformations. Immunol Today. March 1997; 18(3):108-10.
-   Cortese R, et al. Selection of biologically active peptides by phage display of random peptide libraries. Curr Opin Biotechnol. December 1996; 7(6):616-21.
-   Tseng-Law J, et al. Identification of a peptide directed against the anti-CD34 antibody, 9C5, by phage display and its use in hematopoietic stem cell selection studies. Exp Hematol. 1999; 27:936-45.

CD109+ cells are preferably identified and purified by contacting a sample (such as bone marrow or peripheral blood mononuclear cells, or a fraction of such bone marrow or peripheral blood cells) with a CD109 antibody. A variety of methods are available for the subsequent purification of such antibody labelled cells.

When relatively few purified cells are required, the most efficient method involves the use of flow cytometry on a fluorescence activated cell sorter (FACS). Cells are stained as above with CD3/CD109. A sort gate delineating CD3+/CD109+ cells is established, and events contained within this gate are sorted (separated) by FACS.

Cells expressing a specific antigen can also be sorted by a variety of means involving antibodies immobilized on a support or matrix, such as dishes or beads. The latter can be used in suspension, or in the form of a column.

The antigen-specific antibody (antibody 8A3, for example) can be soluble, or directly conjugated to the beads. In the former case, a cell suspension containing CD109+ cells can be incubated with unlabelled CD109 antibodies for 30 minutes at 4° C. After removal of excess unbound antibody, the CD109+ cells can be fractionated (sorted or separated) using beads (magnetic or paramagnetic particles, for example) coated with antibodies recognizing murine immunoglobulins. CD109+ cells coated with CD109 antibody will bind to the anti-murine immunoglobulin coated beads. The cells adhering to the magnetic beads can then be captured by a magnet, or retained in a column, while CD109− cells are not retained. In the latter case, the CD109 antibodies are conjugated directly to the beads. In either case, the CD109+ cells must then be removed from the binding antibody before use. A number of methods for eluting the desired cells have been used, including competitive elution with a short polypeptide corresponding to the epitope recognized by the antibody used to select the CD109+ cells (this is the method used by Baxter/Nexell to select marrow and peripheral blood CD34+ cells with their Isolex selection devices).

CD109 is a marker of the earliest candidate hematopoietic stem and progenitor cells currently identifiable in humans. These primitive cells are capable of long-term hematopoietic reconstitution in vivo and therefore are ideal for bone marrow transplantation. In addition, CD109 expressing hematopoietic stem cells are highly suitable for a variety of gene therapy related ex vivo manipulations prior to bone marrow transplantation. In particular, this early population of cells is ideally suited for gene therapy applications involving long term expression of foreign DNA for the lifetime of the individual.

Purified CD109+ cells are thus useful for treatment of diseases requiring blood stem cell transplantation, including, but not restricted to, (i) hematopoietic malignancies, including leukemia, myelodysplasia, lymphoma, and myeloma; (ii) non-hematopoietic malignancies including breast cancer, malignant melanoma, and renal cell carcinoma; (iii) acquired hematopoietic diseases such as aplastic anemia and PNH [paroxysmal nocturnal hemoglobinuria]; (iv) a spectrum of acquired disorders characterised by autoimmunity, including SLE and rheumatoid arthritis; and (v) a variety of congenital, familial, and acquired conditions treatable by hematopoietic stem cell mediated gene therapy, including, but not restricted to, the congenital anemias, thalassemias, and hemophilias, and other conditions such as Gaucher's disease.

Diagnostic Test

In many diseases, CD109 is aberrantly expressed or is mutated. Detection of CD109 expression is a useful screening tool for the presence of disease or to monitor its progression. For example, CD109 transcripts may be analyzed by DNA sequencing, SSCP analysis, or RFLP analysis, to determine if a CD109 mutation is present. Levels of CD109 expression may also be measured to determine whether CD109 expression is increased or decreased. The presence of a CD109 mutation, or of increased or decreased levels of CD109 expression, is indicative of a disease state.

CD109 is a marker of platelet and T-cell activation. CD109 exists as both a cell surface (K1 [SEQ ID NO:2 and 4]) and a soluble (K15 [SEQ ID NO:10 and 12]) plasma molecule. As it is anchored to the cell membrane by a GPI linkage, the membrane bound (K1 [SEQ ID NO:2 and 4]) form of CD109 can be cleaved from the cell surface with phosphatidylinositol-specific phospholipase C (PI-PLC). Thus, conditions of platelet, endothelial, and T cell activation are associated both with increased levels of membrane-associated CD109 and with increased levels of soluble plasma CD109. The latter is derived from two sources: K15-type soluble CD109 secreted by cells following activation, and K1-type CD109 released from the cell surface by PI-PLC cleavage. In addition, increased levels of membrane associated and soluble CD109 are found in subjects that are at risk of developing such diseases. Thus, the measurement of cell associated and soluble CD109 is used in the diagnosis of such disorders. The measurement of cell associated and soluble CD109 is also used to follow therapeutic response in patients with such disorders, to identify patients at risk of developing such disorders prior to the development of overt disease, and to follow and assess the success of interventional disease preventive strategies in such patients at risk.

The invention includes a method for assessing the levels of soluble and membrane associated CD109 in a subject comprising the following steps: (a) preparing a blood sample (such as whole blood or plasma, or purified blood cells) from a blood specimen collected from the subject; (b) testing for the presence of soluble CD109, and/or of membrane associated CD109 in the sample; and (c) correlating the presence or levels of CD109 in the sample with the presence (or risk) of disease such as cardiovascular disease in the subject.

Such measurement of soluble and membrane associated CD109 is useful in degenerative diseases, disorders or abnormal physical states associated with, but not limited to, conditions associated with endothelial activation, platelet activation, activation of the coagulation or fibrinolytic systems, and activation of T lymphocytes and of the complement system. Specifically, such disorders may include (but are not limited to) cardiovascular disorders including stroke, myocardial infarction, thrombosis, and embolism, and peripheral vascular disease; disorders associated with quantitative or qualitative abnormalities of platelet function, including thrombocytopenia, thrombocythemia, and conditions associated with increased or impaired platelet aggregation and activation; conditions associated with increased or impaired activation of the coagulation and/or fibrinolytic systems; and conditions associated with impaired or increased immune activation, including autoimmune diseases as well as organ and bone marrow transplantation. Measurement of soluble and membrane-associated CD109 may be used not only for the diagnosis of such conditions, but also to monitor therapeutic response, assess prognosis, assess patient disease risk, and to monitor the success of disease preventative interventions in patients at risk.

Cell surface CD109 can be measured by several well-known methods that include, but are not restricted to, Flow Cytometry and Radioimmunoassay:

a. Flow Cytometry

In one example, if activated T-lymphocytes (that express CD109 consequent to the activation process) are required, an anti-coagulated whole blood sample is collected from a healthy volunteer. A fraction enriched in mononuclear cells is prepared by ficoll-hypaque density gradient centrifugation. Cells are then placed in culture using standard media supplemented with phytohemagglutinin (10 mcg/ml; Difco, Detroit, Mich.). After 24-36 hr, cells are washed and stained with a CD109-specific monoclonal antibody conjugated to the fluorescent dye phycoerythrin (CD109 PE) and with a CD3 specific antibody conjugated to fluorescein isothiocyanate (CD3 FITC; this stains all T cells). On a fluorescence activated flow cytometer, T cells and activated T lymphoblasts are then gated based on CD3 staining and CD3+ cells are analysed for CD109 PE staining. The amount of cell surface CD109 corresponds to the CD109-specific fluorescence per cell.
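The gating logic described above (select CD3+ events, then read CD109 PE fluorescence per cell) might be summarised as in the following Python sketch. The gate thresholds, event values and function name are hypothetical; in practice gates are set from unstained and isotype control samples, which are not modelled here.

```python
def cd109_on_t_cells(events, cd3_gate, cd109_gate):
    """Summarise CD109 PE staining within the CD3+ gate.

    events: list of (cd3_fitc, cd109_pe) fluorescence pairs, one per cell.
    cd3_gate, cd109_gate: hypothetical positivity thresholds (assumptions).
    """
    # Step 1: keep only events inside the CD3+ (T cell) gate.
    t_cell_cd109 = [cd109 for cd3, cd109 in events if cd3 >= cd3_gate]
    if not t_cell_cd109:
        return {"n_t_cells": 0, "mean_cd109_pe": None, "fraction_cd109_pos": None}
    # Step 2: quantify CD109-specific fluorescence per gated cell.
    positive = [value for value in t_cell_cd109 if value >= cd109_gate]
    return {
        "n_t_cells": len(t_cell_cd109),
        "mean_cd109_pe": sum(t_cell_cd109) / len(t_cell_cd109),
        "fraction_cd109_pos": len(positive) / len(t_cell_cd109),
    }


if __name__ == "__main__":
    # Made-up events: (CD3 FITC, CD109 PE) in arbitrary fluorescence units.
    demo_events = [(500, 50), (800, 900), (20, 700), (650, 1100)]
    print(cd109_on_t_cells(demo_events, cd3_gate=100, cd109_gate=200))
```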

Similar methods are used to sort CD109+ activated platelets. In one variation, fresh human blood is mixed 9:1 with 3.8% sodium citrate anti-coagulant. Platelet rich plasma is prepared by centrifugation of the citrated blood at 160×g for 15 minutes at 23° C. The plasma is decanted and incubated with 1 mmol/L prostaglandin I2 and 1 mmol/L acetylsalicylic acid for 30 min at 37° C. Gel-filtered platelets are prepared by passing 7 ml plasma over a 50 ml column of Sepharose 2B (Pharmacia) pre-equilibrated with HEPES-Tyrode buffer (HTB) at pH 7.4. Fresh gel-filtered platelets are activated by addition of thrombin to a final concentration of 0.2 U/ml. Activated and non-activated platelets are analyzed by flow cytometry after staining with fluorescently conjugated CD109 antibodies as above.

b. Radioimmunoassay

Gel-filtered platelets or other CD109 expressing cells are incubated with varying concentrations of a CD109 antibody mixed with a fixed amount of radioiodinated CD109 antibody. After 20 min of incubation, the cell bound and unbound antibodies are separated by centrifugation and counted on a gamma counter. The data are analyzed using the method of Scatchard, which gives a direct indication of the number of binding sites present on the target cell for CD109.
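As an illustration of the Scatchard method mentioned above, the following Python sketch fits the standard Scatchard line, bound/free = (Bmax − bound)/Kd, by ordinary least squares. It assumes a single class of non-interacting binding sites and that bound and free antibody concentrations have already been calculated from the gamma counts; the function name and the synthetic data are assumptions for illustration, not measured values.

```python
def scatchard_fit(bound, free):
    """Estimate (Kd, Bmax) from paired bound and free ligand concentrations.

    Fits bound/free against bound; slope = -1/Kd, x-intercept = Bmax.
    Kd is returned in the units of `free`, Bmax in the units of `bound`.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    ratios = [b / f for b, f in zip(bound, free)]
    n = len(bound)
    mean_x = sum(bound) / n
    mean_y = sum(ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in bound)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound, ratios))
    if sxx == 0:
        raise ValueError("bound values must not all be identical")
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("a negative slope is expected for saturable binding")
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope
    bmax = -intercept / slope
    return kd, bmax


if __name__ == "__main__":
    # Synthetic data generated from Kd = 2.0 and Bmax = 1000 (arbitrary units).
    free = [0.5, 1.0, 2.0, 4.0, 8.0]
    bound = [1000 * f / (2.0 + f) for f in free]
    kd, bmax = scatchard_fit(bound, free)
    print(f"Kd = {kd:.3f}, Bmax = {bmax:.1f}")
```

Dividing Bmax (expressed in molecules) by the number of cells in the assay would then give the number of binding sites per cell.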

Soluble CD109, in contrast, can be measured by a variety of well-known immunological methods including, but not restricted to, the enzyme-linked immunosorbent assay (ELISA), which has been described in many variant forms. Typically, a “capture” monoclonal antibody (such as antibody 8A3 in the case of CD109) is immobilized on the surface of multiple wells of a plate or dish. After washing with buffer and blocking (with Bovine Serum Albumin [BSA], for example), serial dilutions of sample containing CD109 (plasma, serum, or cell supernatants, for example) are added. The CD109 in solution will become bound to the immobilized capture antibody. After incubation and washing, a “detection” antibody (such as antibody 1B3 in the case of CD109) is added to each well and incubated. After removal of unbound detection antibody by washing, detection antibody binding is quantified spectrophotometrically following the addition of an enzyme-linked (alkaline phosphatase, for example) goat-anti-mouse antibody reagent. The amount of soluble antigen can then be determined by comparison of the signal obtained from various sample dilutions with that derived from a standard curve established using serial dilutions of known concentrations of native or recombinant CD109. Variations of this general approach use different “capture” antibody immobilization techniques and different “detection” antibody signal detection techniques.
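The standard-curve comparison described above can be sketched as follows. This minimal Python example uses piecewise-linear interpolation between standards and averages over the sample dilutions that fall within the range of the curve; the function names and all numerical values are illustrative assumptions, and other curve models (for example a four-parameter logistic fit) could be substituted.

```python
def interpolate_concentration(signal, standards):
    """Read a concentration off a standard curve by linear interpolation.

    standards: list of (known_concentration, signal) pairs; the signal is
    assumed to increase with concentration. Returns None when the signal
    lies outside the range spanned by the standards.
    """
    points = sorted(standards, key=lambda pair: pair[1])
    if not points[0][1] <= signal <= points[-1][1]:
        return None
    for (c0, s0), (c1, s1) in zip(points, points[1:]):
        if s0 <= signal <= s1:
            if s1 == s0:
                return c0
            return c0 + (c1 - c0) * (signal - s0) / (s1 - s0)
    return None


def sample_concentration(readings, standards):
    """Average the back-calculated concentration over in-range dilutions.

    readings: list of (dilution_factor, signal) pairs for one sample.
    """
    estimates = []
    for dilution_factor, signal in readings:
        conc = interpolate_concentration(signal, standards)
        if conc is not None:
            # Correct for the dilution applied before the assay.
            estimates.append(conc * dilution_factor)
    return sum(estimates) / len(estimates) if estimates else None


if __name__ == "__main__":
    # Made-up standards: (concentration in ng/ml, absorbance).
    curve = [(0, 0.05), (10, 0.25), (20, 0.45), (40, 0.85)]
    # Made-up sample readings: (dilution factor, absorbance).
    print(sample_concentration([(2, 1.50), (10, 0.65), (20, 0.35)], curve))
```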

Kits

The invention also includes a kit for conferring increased CD109 activity to a host cell including a nucleic acid molecule of the invention (preferably in a composition of the invention) and preferably reagents for transforming the host cell.

The invention also includes a kit for detecting the presence of CD109 nucleic acid molecule (e.g. a molecule in the figures), comprising at least one probe of the invention. Kits may be prepared according to known techniques, for example, see U.S. Pat. Nos. 5,851,788 and 5,750,653. The kit preferably includes an antibody, a medium suitable for the formation of an immunological complex between the antibody and a polypeptide recognized by the antibody and a reagent capable of detecting the immunological complex to ascertain the presence of CD109 or a similar polypeptide in a biological sample. Further background on the use of antibodies is provided, for example, in U.S. Pat. Nos. 5,695,931 and 5,837,472, which are incorporated by reference in their entirety.

Screening for Agonists and Antagonists of CD109 Nucleic Acid Molecule and Enhancers and Inhibitors of CD109 Polypeptide

Inhibitors are preferably directed towards specific domains of CD109 to block CD109 activation. To achieve specificity, inhibitors should target the unique sequences of CD109. For example, they could block the thioester signature sequence or the defining hexapeptide motif of CD109. A similar approach can be used to search for compounds that may enhance CD109 activation.

Inhibitors are preferably directed towards specific domains of CD109 to block CD109 activation or substrate binding. To achieve specificity, inhibitors should target the unique sequences of CD109 in Tables 1a and 1b. A similar approach can be used to search for compounds that enhance CD109 activation.

A method of identifying a compound which modulates the activity of CD109 can include:

a) contacting (i) CD109, a fragment of CD109 or a derivative of either of the foregoing with (ii) a CD109-binding carbohydrate or protein containing substrate (such as a mammalian, preferably human, cell membrane) in the presence of the compound; and

b) determining whether the interaction between (i) and (ii) is modulated, thereby indicating that the compound modulates the interaction of CD109 and the substrate. The method preferably involves determining whether the compound increases or decreases proteolytic cleavage of CD109 (and thus the activation of its thioester). One preferably determines whether the compound causes CD109 to become more or less reactive towards nucleophiles and more or less likely to form ester or amide bonds with the substrate.

Modulation can include increasing or decreasing the interaction between (i) and (ii). A CD109 inhibitor inhibits the interaction between (i) and (ii) while an enhancer increases the interaction.

The method preferably includes identifying a compound that blocks the bait region of CD109. The method may alternatively include identifying a compound that interferes with a CD109 domain involved in targeting the polypeptide to the plasma membrane. The method may alternatively include identifying a compound that interferes with the thioester signature sequence or thioester reactivity defining hexapeptide motif. A similar approach can be used to search for compounds that may enhance CD109 activation.

In a preferred embodiment, the invention includes an assay for evaluating whether test compounds are capable of acting as agonists or antagonists for CD109, or a polypeptide having CD109 functional activity, including culturing cells containing DNA which expresses CD109, or a polypeptide having CD109 activity, so that the culturing is carried out in the presence of at least one compound whose ability to modulate CD109 activity is sought to be determined, and thereafter monitoring the cells for either an increase or decrease in the level of CD109 or CD109 activity.

Other assays (as well as variations of the above assay) will be apparent from the description of this invention and techniques such as those disclosed in U.S. Pat. Nos. 5,851,788, 5,736,337 and 5,767,075, which are incorporated by reference in their entirety. For example, the test compound levels may be either fixed or increasing.

Isolation of a CD109 cDNA

The restricted pattern of expression of CD109 within hematopoietic cells (CD109 is expressed by a subset of early progenitor and candidate HSCs, and by activated platelets and T cells, but not by their resting counterparts) shows that it plays a role in hematopoiesis, in cell-mediated immunity and in hemostasis. We have used an immuno-purification/microsequencing strategy to isolate a human CD109 cDNA. Several lines of evidence show that this cDNA has been correctly identified: not only did this clone encode 16 of 17 CD109-specific peptides originally identified by immunoaffinity purification of CD109 with the antibody 8A3, but expression of this cDNA resulted in the expression of a protein that could be detected by CD109-specific mAbs, both in vitro and in vivo. However, the presence of multiple CD109 transcripts by Northern analysis, as well as the presence of an additional CD109-related peptide that cannot be accounted for by our cDNA, shows that there exist additional or alternative CD109 variants as well.

CD109 is a Novel Member of the α2M/C3, C4, C5 Family of Thioester-Containing Proteins

Consistent with the known size and biochemical features of CD109, cDNA clone K1 encodes a 1445 aa protein containing multiple N-linked glycosylation sites, an amino-terminal leader peptide, and a consensus C-terminal GPI anchor cleavage/addition site. By virtue of high sequence similarity throughout the entire molecule, and in particular the presence of a typical thioester motif, CD109 is defined as a new member of the α2M/C3, C4, C5 superfamily of thioester-containing proteins⁵⁰.

The α2M/C3, C4, C5 family comprises two general divisions (the α2M-like protease inhibitors, and the complement proteins) that are believed to have arisen from a common, ancestral α2M-like molecule.⁵⁰ By sequence similarity, CD109 is closely related to the α2M inhibitors, and much more distantly to C3 and C4 complement proteins. The overall organization and size of CD109 are more typical of α2M inhibitors as well: a unique bait region (residues ˜651-683) with no homology to known proteins lies in the middle of the ˜162 kD chain, and a typical thioester motif (residues 918-924) lies about two-thirds of the way along the molecule. In addition, a hexapeptide motif (residues 1030-1035) that defines the chemical reactivity of the thioester (by protein folding, this domain interacts with, and modulates the reactivity of, the thioester) is found, as expected, about 100 aa further downstream.

CD109 differs from typical α2M inhibitors in several respects, however. First, while most α2M protease inhibitors exist as oligomers of a ˜180 kD subunit⁵⁰ (for example, human α2 macroglobulin occurs in plasma as a ˜720 kD tetramer), CD109 exists as a monomer.^(3,5,6) To date, monomeric α2M protease inhibitors have been characterized primarily in rodents⁵⁰, although they are believed to exist in other vertebrates as well.⁵⁴ Second, CD109 is membrane-bound via a GPI anchor. Membrane bound α2M/C3, C4, C5 proteins have not been described previously. Third, a variety of activated human and rodent α2M inhibitors have been shown to interact with two cellular receptors: the low density lipoprotein receptor-related protein/α2M receptor (LRP-α2MR) on macrophages (and a variety of other cell types), which mediates the clearance of inhibitor/protease complexes from the circulation⁵⁶, and the α2-macroglobulin signalling receptor (α2MSR), which is coupled to a pertussis toxin-insensitive G protein, thereby mediating a variety of α2M activation-dependent signals⁵⁷⁻⁶⁶. The carboxyl end of CD109, however, does not contain the KPTVK motif^(59,67-75) required for receptor binding. Fourth, although CD109 bears much greater overall sequence similarity to α2M proteins than it does to complement, its reactivity-defining hexapeptide does not end in LLN as in other α2M proteins, but rather ends in the ¹⁰³³VIH¹⁰³⁵ triplet characteristic of complement proteins (see below). Overall, therefore, CD109 defines a new member of the α2M family, but one with unusual features.

The Thioester Specificity of Activated CD109 Likely Resembles That ofComplement C3 and C4

The defining structural feature of this family—the intrachain thioester bond—is typically unreactive in the native molecule (except with small nucleophiles such as methylamine). Upon proteolytic cleavage of the molecule, however—by specific activating enzymes in the case of the complement proteins, or by a wide range of proteases in the case of the protease inhibitors—a conformational change occurs, and the thioester becomes highly reactive towards nucleophiles, such that the proteins become covalently bound to nearby macromolecules via ester or amide bonds.⁵⁰ In the case of complement, this leads to the covalent deposition of C3 and C4 on the target cell and on immune aggregates. In the case of the protease inhibitors, covalent binding of the activating protease may similarly occur. CD109 becomes activated by proteolytic cleavage as well.

Curiously, while C3 and C4 preferentially bind to substrate molecules by forming ester bonds with hydroxyl groups on carbohydrates or proteins, α2M proteins generally form amide bonds with proteins^(50,78). As alluded to above, this differential specificity is determined by whether His or Asn occupies the terminal position of a conserved hexapeptide lying ˜100 amino acids C-terminal to the thioester bond^(77,78) (by protein folding, this domain interacts with, and modulates the reactivity of, the thioester⁷⁹). In the case of the complement proteins, this hexapeptide usually ends in a V-I-H triplet; the corresponding α2M inhibitor triplet is usually L-L-N. As has recently been elucidated,^(50,80,81) proteolytically activated Asn-containing molecules undergo direct nucleophilic attack of the thioester carbonyl in an uncatalyzed reaction. In contrast, proteolytic activation of His-containing molecules results in a catalyzed transacylation reaction that involves the initial intramolecular transacylation of the thioester carbonyl to the imidazole ring of His, forming a covalent intramolecular acyl imidazole intermediate. The liberated thioester cysteinyl sulfhydryl then acts as a general base to deprotonate hydroxyl nucleophiles for attack on the acyl imidazole intermediate. The catalyzed (His) reaction thus facilitates transacylation to hydroxyl-containing carbohydrate or protein targets, while in the uncatalyzed (Asn) reaction, only primary amine groups of protein targets are sufficiently nucleophilic to attack the thioester bond directly. In addition, the t_(1/2) of the reactive thioester is known to be much shorter if His is substituted for Asn, as the intermediate in the catalyzed reaction will react quickly with the most common hydroxyl-containing nucleophile, water.^(50,80-82) Thus, α2M family proteins whose hexapeptide ends in a His residue can react with both carbohydrate and protein targets, but by virtue of the short half-life of the reactive thioester, such binding is believed to be tightly restricted spatially to the initial site of activation. As the CD109 regulatory motif ends in the VIH triplet usually associated with complement, it is likely not only that proteolytically activated CD109 forms ester bonds with hydroxyl groups on carbohydrates or proteins, rather than amide bonds, but also that this reactivity is short-lived and highly restricted spatially to the site of activation, defining activated CD109 as a locally-acting molecule.
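The rule described above, in which the terminal residue of the reactivity-defining hexapeptide predicts the thioester chemistry, can be summarized as a small lookup. The following is only an illustrative sketch: the function name is hypothetical, and the hexapeptide strings use "X" for residues that are not specified here (only the terminal triplets VIH and LLN are taken from the text).

```python
def thioester_specificity(hexapeptide: str) -> str:
    """Predict thioester reactivity from the last residue of the hexapeptide.

    His (H): catalyzed transacylation, ester bonds to hydroxyl groups,
             short-lived reactive species.
    Asn (N): uncatalyzed reaction, amide bonds to primary amines.
    Any other residue is reported as unclassified.
    """
    if len(hexapeptide) != 6:
        raise ValueError("expected a six-residue motif")
    last = hexapeptide[-1].upper()
    if last == "H":
        return "catalyzed: ester bonds to hydroxyl groups, short half-life"
    if last == "N":
        return "uncatalyzed: amide bonds to primary amines"
    return "unclassified"


# Placeholder motifs: "XXX" stands for residues not given in the text.
print(thioester_specificity("XXXVIH"))  # complement-like ending, as in CD109
print(thioester_specificity("XXXLLN"))  # typical alpha-2M inhibitor ending
```

The sketch encodes only the qualitative rule stated above and does not model half-life or local sequence context.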

Mechanism(s) of Action of CD109

The identification of CD109 as a novel, monomeric α2M-type inhibitor with “complement-like” thioester reactivity allows several functional conclusions:

First, CD109 becomes activated by proteolytic cleavage. While complement proteins are cleaved by specific activating proteases during complement activation, α2M inhibitors are able to interact with a variety of proteases. Indeed, α2M inhibitors have the unique ability to inhibit proteinases of all four mechanistic classes.^(50,83) This promiscuous activity is defined by the “bait region”, a stretch of ˜30 amino acids that is unique to each α2M inhibitor, lies about halfway along the protein, and contains cleavage sites for proteases of all mechanistic classes and with diverse specificities. Thus, typical α2M inhibitors do not function as specific inhibitors, but rather are believed to inhibit proteases for which no specific inhibitors are present, and proteases that are released in excess of their specific inhibitors.

Second, proteolytic cleavage of CD109 results in a conformational change that leads to the covalent crosslinking of CD109 to adjacent molecules. Such covalent crosslinking activity is essential for the action of C3 and C4 (leading to complement deposition on immune complexes and cell membranes), as well as for the protease inhibitory action of monomeric α2M inhibitors (leading to the binding/inhibition of the activating protease). In contrast, while activated multimeric α2M proteins are capable of covalent binding as well, this activity is not essential for protease inhibition. Rather, multimeric α2M inhibitors are believed to “entrap” proteases non-covalently after undergoing a bait region cleavage-induced conformational change, such that the protease is prevented from accessing substrates in a size-restricted manner^(50,51,83). CD109 functions as a protease inhibitor, and this activity would likely require covalent binding to the activating protease, resulting in the formation of a CD109/protease complex.

Third, the covalent binding of activated CD109 may not be restricted to proteases. Proteolytically activated α2M is able to bind other proteins and peptides as well, by both non-covalent “trapping” and by covalent thioester-mediated binding mechanisms. These interactions are believed to regulate the plasma stability, transport, and clearance of a variety of molecules including small proteases,⁸⁴ other proteins including antimicrobial defensins,⁸⁵ apolipoprotein E,⁸⁶ and amyloid beta⁸⁷⁻⁹⁰ (indeed, specific allelic variants of α2M are reportedly associated with Alzheimer's disease^(91,92)), a wide spectrum of hormones, growth factors, and cytokines (including insulin, TNFα, IL-1β, IL-2, IL-6, IL-8, bFGF, βNGF, PDGF, TGFβ, VEGF, EGF⁹³⁻⁹⁷ and leptin⁹⁸), as well as a variety of peptide antigens.^(96,99-102) While proteolytically activated monomeric α2M family proteins such as CD109 would not be expected to be able to “trap” other molecules non-covalently as do their multimeric counterparts, they likely are capable of covalent binding to other non-protease substrates. Indeed, in view of the “complement-like” spatially restricted thioester reactivity of activated CD109, and its location as a cell surface molecule, it is likely that such non-protease substrates include cell membranes.

And fourth, activated CD109/substrate complexes would not be expected to interact with LRP-α2MR and α2MSR receptors in the manner of other α2M family proteins. Thus, the route(s) by which putative activated CD109/substrate complexes, released from the cell surface by GPI anchor cleavage, would be cleared from the circulation remain(s) undefined. Similarly, it is unlikely that such complexes would signal via α2MSR. It is possible, however, that GPI-anchored substrate/CD109 complexes could be internalized directly, and could transduce intracellular signals, as has been described for a variety of other GPI-linked proteins¹⁰³⁻¹⁰⁸.

Biological Role(s) of CD109

CD109 is a marker of specific hematopoietic stem/progenitor cell subsets, and a target antigen in alloimmune platelet destruction. The identification of CD109 as an atypical member of the α2M/C3, C4, C5 protein superfamily indicates a role for CD109 in hematopoiesis and in platelet and T cell activation. Following proteolytic cleavage, CD109 becomes capable of covalent binding to carbohydrate- and protein-containing substrates, including cell membranes. These interactions function to modulate and regulate these cellular processes. For example, CD109 functions as a locally acting membrane-bound α2M-type antiprotease.

Immunoaffinity Purification and Partial Amino Acid Sequencing of CD109

When analysed by SDS-PAGE/silver staining, the eluate from the mAb 8A3/Protein A Sepharose column yielded two bands of ˜150 and ˜170 kD characteristic of CD109^(3,15,16). Previous data have indicated that the ˜150 kD form is likely derived proteolytically from the ˜170 kD species^(3,6). The eluate was subsequently purified further by preparative SDS-PAGE, and the larger band was excised and digested with endoproteinase Lys-C or Asp-N. Purification and sequence analysis of the resultant peptide fragments yielded 20 peptide sequences ranging in size from 7-20 amino acids (aa). As shown in Table 1, after overlapping sequences were combined, 17 independent CD109-derived peptide sequences were obtained.

cDNA Cloning and Analysis

BLAST analysis^(36,37) using these 17 CD109 peptide sequences identified a rat incisor EST—R47123³⁸—that encoded the CD109-specific peptide fr76/60. A ˜4 kb EcoRI/XhoI fragment from R47123 was subsequently used to probe a λ phage Uni-ZAP human umbilical vein endothelial cell (HUVEC) cDNA library (Stratagene), yielding 8 independent clones that were rescued into pBS SK⁺ by in vivo excision. Restriction enzyme analysis and confirmatory DNA sequencing assigned these clones to two overlapping groups: The first comprised 7 clones that were progressive 5′ truncations of the longest example—clone H6—a ˜3 kb clone containing a ˜2.7 kb open reading frame (ORF), followed by a 300 bp 3′ untranslated region (UTR) ending with a poly(A) tract. The second—consisting of clone H7 (˜2 kb)—was contiguous with the H6 series cDNAs, but contained a longer 3′ UTR that extended an additional 1,132 bp prior to the appearance of a poly(A) tract. As clone H6 was not full-length, we endeavored unsuccessfully to obtain additional, more 5′ CD109 cDNA sequence by rescreening the HUVEC library with H6 itself. The screening of a variety of additional commercial cDNA libraries was similarly unrewarding.

Two human erythroleukemia (HEL) cDNA libraries—L12 and L13—were screened by nested PCR using two adjacent CD109-specific antisense oligonucleotide primers lying at the 5′ end of clone H6, together with library-specific hEF-1 (sense) and supF (antisense) primers. This approach yielded one PCR product (clone E2) that overlapped with H6 and extended the CD109 sequence 450 bp in the 5′ direction. Rescreening of L12 and L13 using more 5′ clone E2-derived antisense oligonucleotides did not yield additional sequence.

Finally, a new ZAP Express KG1a cDNA library was screened using a clone H6 probe, yielding 9 independent clones that were rescued into pBK-CMV by in vivo excision. Restriction enzyme analysis and confirmatory DNA sequencing demonstrated that these clones comprised a series of progressive 5′ deletions of the longest example—clone K1 [SEQ ID NO:1 and 3] (˜4.7 kb)—that encompassed clones H6 and E2 in their entirety, and contained ˜1.3 kb of additional 5′ cDNA sequence.

The nucleotide sequences of the 4 overlapping clones H6, H7, E2, and K1 were determined in their entirety for both strands, and were found to be in agreement in all cases. Clone K1 [SEQ ID NO:1 and 3] contains a 4,335 bp ORF flanked by 112 bp 5′ and 300 bp 3′ UTRs, respectively. The putative translation start, while not comprising an optimal Kozak consensus sequence, is preceded by stop codons in all three frames. (In addition [see below], the first 20 codons downstream of this start are predicted to encode a cleavable signal peptide.) The K1 3′ UTR contains a canonical polyadenylation signal—AATAAA—15 bp upstream of the poly(A) tail. The clone H7 3′ UTR is contiguous with that of clones K1 and H6, but extends an additional 1,132 bp in the 3′ direction. Two polyadenylation signals are found 34 and 19 bp, respectively, upstream of the H7 poly(A) tail.
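The segment lengths reported for clone K1 can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, the poly(A) tail is excluded from the total, and it is an assumption that the quoted 4,335 bp ORF is counted in whole codons.

```python
UTR5_BP = 112   # 5' UTR length of clone K1, from the text
ORF_BP = 4335   # open reading frame length, from the text
UTR3_BP = 300   # 3' UTR length of clone K1, from the text


def cdna_length(utr5: int, orf: int, utr3: int) -> int:
    """Total cDNA length in bp, excluding the poly(A) tail."""
    return utr5 + orf + utr3


def codon_count(orf_bp: int) -> int:
    """Number of codons in an ORF whose length must be a multiple of 3."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    return orf_bp // 3


print(cdna_length(UTR5_BP, ORF_BP, UTR3_BP))  # 4747
print(codon_count(ORF_BP))                    # 1445
```

The total of 4,747 bp agrees with the ˜4.7 kb size given for clone K1, and 4,335 bp corresponds to 1,445 codons.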

Clone K1 Encodes CD109

To show that cDNA clone K1 [SEQ ID NO:1 and 3] did encode CD109, we initially examined the predicted K1 [SEQ ID NO:2 and 4] protein sequence: Notably, the clone K1 ORF was found to contain 16 of the 17 CD109-derived peptide sequences described above. Next, we confirmed that the protein encoded by K1 [SEQ ID NO:2 and 4] could be detected by CD109-specific mAbs. When transcribed and translated in vitro, K1 yielded a protein of ˜160 kD that was recognized by mAb 1B3¹⁵. In addition, stable expression of clone K1 was able to confer mAb 8A3 binding to transfected CHO cells. Clone K1 therefore represents a CD109 cDNA.

CD109 is Predicted to be a GPI-Linked Thioester-Containing Protein

Consistent with the known size of CD109, the translated K1 sequence (FIG. 3A) predicts a 1445 aa protein of ˜162 kD bearing a cleavable 21 aa N-terminal leader peptide⁴² and containing 17 N-linked glycosylation sites. And as expected, the presence of a C-terminal hydrophobic tail preceded by a short hydrophilic stretch and a cluster of non-bulky amino acids defines a GPI anchor cleavage/addition site, with cleavage predicted to occur after aa 1420⁴³⁻⁴⁵. Notably, the presence of a thioester signature sequence^(46,47)—⁹¹⁸PYGCGEQ⁹²⁴—defines CD109 as a member of the α2 macroglobulin (α2M)/C3, C4, C5 superfamily of thioester-containing protease inhibitor and complement proteins. Indeed, as assessed by BLAST^(36,37) analysis, CD109 bears ˜45-50% overall sequence similarity to other vertebrate and invertebrate α2M proteins (and is more distantly related to vertebrate complement C3 and C4 proteins), and shares the overall domain structure of the α2M family, with particularly high similarity in the region of the thioester and in 11 additional α2M family-specific conserved sequence blocks^(48,49), including a hexapeptide motif⁵⁰ (residues 1030-1035) that lies ˜100 aa downstream of the thioester and is believed to define its chemical reactivity (see below). In addition to these highly conserved regions, each α2M protein also contains a unique “bait region” that defines substrate specificity⁵⁰. Consistent with this, CD109 also contains a putative “bait region” (˜ residues 651-683; FIGS. 3A and 5A) that, as expected, is unrelated to the corresponding regions of other family members.
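Sequence features of the kind described above can be located in a protein sequence by simple pattern matching. The following is a minimal sketch, not the method used to annotate CD109: the function names and the short test sequence are hypothetical, the thioester signature PYGCGEQ is taken from the text, and the N-linked glycosylation sequon is assumed to follow the commonly used N-X-S/T rule (X not Pro).

```python
import re


def find_motif(seq: str, motif: str) -> list[int]:
    """Return 1-based start positions of every (possibly overlapping) match."""
    pattern = re.compile(f"(?={re.escape(motif)})")
    return [m.start() + 1 for m in pattern.finditer(seq)]


def find_n_glyc_sequons(seq: str) -> list[int]:
    """Return 1-based positions of Asn in N-X-S/T sequons, where X is not Pro."""
    return [m.start() + 1 for m in re.finditer(r"(?=N[^P][ST])", seq)]


# Hypothetical toy sequence for illustration; it is NOT the CD109 sequence.
toy = "MKNVSAPYGCGEQNPTLNGT"
print(find_motif(toy, "PYGCGEQ"))   # [7]
print(find_n_glyc_sequons(toy))     # [3, 18]  (N14-P15-T16 is excluded)
```

Applied to the full-length translated sequence, the same functions would be expected to report the thioester signature at residue 918 and the candidate glycosylation sites, subject to the assumptions stated above.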

Native CD109 Contains an Intact Thioester

The defining structural feature of the α2M/C3, C4, C5 family—an intrachain thioester bond—is typically situated about two thirds of the way along the pro-molecule. In the native molecule, this bond—formed between a cysteinyl side chain sulfhydryl and a glutamine side chain carbonyl in the sequence CGEQ—is unreactive, except with small nucleophiles such as methylamine, which can therefore be used to disrupt the thioester^(50,51).

In addition, under experimental conditions of heat or chemical denaturation (preparing a sample for SDS-PAGE, for example), both complement and α2M inhibitors may undergo internal nucleophilic attack on the thioester, resulting in the autolytic cleavage of the protein^(52,53). Although not of physiological significance, this autolytic reaction is useful diagnostically to indicate the presence of an intact thioester bond. We therefore tested whether native CD109 could undergo high temperature autolytic cleavage that could be prevented by pretreatment with methylamine, since thioester-dependent cleavage would be abrogated by disruption of the thioester bond with methylamine. When KG1a-derived mAb 8A3/CD109 immune complexes were treated with 400 mM methylamine prior to boiling, only a single 170 kD CD109 band was subsequently observed. In the absence of methylamine treatment, however, boiling resulted in the appearance of the typical 150 kD form, and of an associated ˜20 kD fragment. In contrast, and consistent with the known inability of standard cell-free systems to support intramolecular thioester formation (XX), only a single band was observed when CD109 synthesized in vitro was boiled in the absence of methylamine. These observations demonstrate that native CD109 does indeed contain an intact thioester. The 150 kD CD109 band is derived from the 170 kD form by autolytic rather than by proteolytic cleavage.

Expression Pattern of CD109

As noted above, two distinct CD109 3′ UTRs were isolated by library screening. To confirm that both cDNA variants were produced, and to discern the relative prevalence of these two forms, we assessed their expression in KG1a cells by semi-quantitative RT-PCR. Both variants were detected readily in KG1a RNA. Consistent with these data, multiple KG1a CD109 transcripts were detectable as well by Northern analysis. While the smallest ˜5.4 kb transcript may correspond to the K1 cDNA, the identity of the two larger transcripts and their relationship to the H7 cDNA remain obscure. In any case, the existence of multiple CD109 transcripts shows that there exist additional CD109 variants that have not yet been identified. All CD109 variants are included within the scope of this invention. A CD109 probe was also used to evaluate the tissue range of expression of CD109 using a commercial multiple tissue blot bearing a series of human adult and fetal RNAs. CD109 transcripts were detected in a wide range of tissues, with highest levels being found in adult uterus, aorta, heart, lung, trachea, and placenta, and in fetal heart, kidney, liver, spleen and lung. Whether these data indicate true widespread expression, or merely reflect expression in endothelial cells (or both) is not known.

Chromosomal Mapping of the CD109 Locus

Using cDNA clone K1 [SEQ ID NO:1] as a probe, two positive genomic PAC clones—94J24 and 4L10—were identified. The chromosomal assignment of each of these clones was then determined by FISH analysis of 20 well-spread metaphases. Both PACs mapped to 6q12-13, with positive hybridization signals being observed in >90% of the cells, and on both homologues in >90% of the positive spreads. This chromosomal location was then confirmed by radiation hybrid mapping: Using both the GeneBridge 4 and the Stanford G3 panels and a 3′ UTR PCR probe, CD109 was mapped to within 11.09 cR and 6.9 cR of framework markers CHLC.GATA11F10 and SHGC-33186, respectively—a region that corresponds to 6q13.

Materials and Methods

Cell Culture

KG1a acute myeloblastic leukemia cells^(1,2) were grown in RPMI 1640 medium supplemented with 10% heat-inactivated fetal calf serum (Hyclone), 2 mM L-glutamine, 100 U/ml penicillin, and 100 μg/ml streptomycin (Gibco Life Technologies). Chinese Hamster Ovary (CHO) cells were grown in F-12 (Ham) Nutrient Mixture (Gibco Life Technologies) supplemented as above, while 293 cells were cultured in similarly supplemented Dulbecco's Modified Eagle Medium (DMEM; Gibco Life Technologies). All cells were initially obtained from ATCC and were maintained at 37° C. in a humidified atmosphere containing 5% CO₂.

Antibodies

CD109 MAbs 8A3, 7D1, 7C5 and 8A1 were raised against KG1a cells asdescribed previously.³ CD109 antibody 1B3¹⁵ Other CD109 antibodies usedin this study were obtained through the Endothelial Panel of the Vth andVIth HLDA Leukocyte Typing workshops.^(5,6) The mAbs D51²⁸ (gift ofH.-J. Gross) and KC4²⁹, which recognize CD71 and CD62P respectively,were used as immunoprecipitation and affinity purification controls, asappropriate.

Immunoaffinity Purification and Partial Amino Acid Sequencing of CD109

1×10¹⁰ KG1a cells were pelleted and washed three times in ice cold phosphate buffered saline (PBS; 140 mM NaCl, 2.7 mM KCl, 10 mM Na₂HPO₄, 1.8 mM KH₂PO₄) pH 7.4 supplemented with 0.2 mM EDTA. Pelleted cells were then resuspended vigorously in 300 ml ice cold lysis buffer (0.01 M Tris-HCl, pH 8.1, 0.15 M NaCl and 0.5% Nonidet P-40 [NP40]) freshly supplemented with 2 mM PMSF, 1 mg/ml BSA, and 2 mM EDTA (Sigma) as described previously^(30,31), and were kept on ice for 20 minutes with vigorous stirring. The lysate was clarified by centrifugation at 100,000×g for 30 minutes, and the supernatant was decanted and brought to 0.5 M NaCl. CD109 antigen was separated from the crude cell lysate by cross-linked immuno-affinity chromatography. Briefly, 10 mg of CD109 monoclonal antibody 8A3 were bound to 1 ml of Protein A Sepharose beads (Pharmacia) and linked covalently using the homobifunctional crosslinker dimethyl pimelimidate (Pierce). After loading the lysate, the column was washed extensively with lysis buffer containing 0.5 M NaCl, followed by lysis buffer containing 0.1% SDS. Bound CD109 was subsequently eluted with 0.05 M diethylamine (pH 11.5) containing 0.5% deoxycholate (DOC). The eluted material was adjusted to pH 8.1 by the careful addition of 0.1 N HCl^(31,32) and its purity was assessed by SDS-PAGE and silver staining.³³

The eluted CD109 preparation was further fractionated by preparative7.5% SDS-PAGE, and the major band at 170 kD was visualized with RG 250Coomassie blue. The band was excised and was digested overnight witheither endoproteinase Lys-C or Asp-N (Boehringer). The resultantpeptides were extracted and separated by tandem ion exchange and reversephase chromatography. Fractions were collected manually into coatedtubes, and were sequenced with an Applied Biosystems Procise sequencerusing rapid cycle chemistry.³⁴
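The peptides expected from the two digests can be predicted in silico. The sketch below assumes the textbook specificities (endoproteinase Lys-C cleaves on the C-terminal side of Lys; endoproteinase Asp-N cleaves on the N-terminal side of Asp) and ignores missed cleavages and secondary specificities. The function names and the short example sequence are hypothetical.

```python
def digest_lys_c(seq: str) -> list[str]:
    """Cleave after every Lys (K); complete digestion is assumed."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        if aa == "K":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])
    return peptides


def digest_asp_n(seq: str) -> list[str]:
    """Cleave before every Asp (D); complete digestion is assumed."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        if aa == "D" and i > start:
            peptides.append(seq[start:i])
            start = i
    if seq:
        peptides.append(seq[start:])
    return peptides


# Hypothetical toy sequence for illustration; it is NOT a CD109 peptide.
toy = "AGKDLLKTDE"
print(digest_lys_c(toy))  # ['AGK', 'DLLK', 'TDE']
print(digest_asp_n(toy))  # ['AGK', 'DLLKT', 'DE']
```

Because the two enzymes cut at different residues, their peptide sets overlap, which is what allows overlapping sequences to be combined into longer independent stretches.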

Labeling of Cells and Immunoprecipitation of CD109

KG1a cells were labeled by lactoperoxidase-catalyzed radio-iodination asdescribed previously³⁰, or by biotinylation using the water-solublebiotin derivative Sulfo-NHS-LC-LC-Biotin (Pierce). In the latter case,2.5×10⁷ KG1a cells were washed three times in ice cold PBS pH 8.0 andwere resuspended in 1 ml of the same buffer at room temperature. 1 mg ofSulfo-NHS-LC-LC-Biotin was added to the cell suspension. After 20minutes at room temperature, the reaction was terminated by washing thecells three times with ice cold PBS pH 7.4 containing 2 mM EDTA.

Labeled cells were lysed in ice cold lysis buffer (0.01 M Tris-HCl, pH 8.1; 0.15 M NaCl and 0.5% NP40) freshly supplemented with 2 mM PMSF, 1 mg/ml BSA, and 2 mM EDTA as described previously.^(30,31) After 30 minutes on ice, the lysate was clarified by centrifugation at 14,000×g for 5 minutes. The supernatant was then decanted, brought to 0.5 M NaCl by the addition of a 1/10 volume of 3.5 M NaCl, and used for immunoprecipitation with specific antibodies as previously detailed.^(30,31) Briefly, immune complexes were collected on protein A-Sepharose beads, or on rabbit anti-mouse immunoglobulin-coated protein A-Sepharose beads as appropriate. Immune complexes were then washed twice with lysis buffer containing 0.5 M NaCl, and once with lysis buffer supplemented with 0.5% DOC/0.1% SDS.
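The salt adjustment described above can be checked with a simple mixing calculation. This sketch assumes that the clarified lysate is at the 0.15 M NaCl of the lysis buffer, that "a 1/10 volume" means 0.1 volume of stock per volume of lysate, and that volumes are additive; the function name is hypothetical.

```python
def mixed_concentration(c1: float, v1: float, c2: float, v2: float) -> float:
    """Concentration after mixing two solutions, assuming additive volumes."""
    return (c1 * v1 + c2 * v2) / (v1 + v2)


# 1 volume of lysate at 0.15 M NaCl plus 0.1 volume of 3.5 M NaCl stock.
final_molar = mixed_concentration(0.15, 1.0, 3.5, 0.1)
print(round(final_molar, 3))  # 0.455
```

Under these assumptions the final concentration is about 0.45 M, that is, close to the nominal 0.5 M stated in the protocol.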

Methylamine Treatment of Immunoprecipitates

Immunoprecipitates prepared as above were washed further with 0.2 MHepes, pH 8/0.1% NP40 and were then resuspended in 1 ml fresh 0.4 Mmethylamine (Sigma)/0.2 M Hepes, pH 8/0.1% NP40. After 30 minutes at 37°C., methylamine-treated immunoprecipitates were washed once with 0.2 MHepes, pH 8/0.1% NP40 and once in lysis buffer containing 0.1% SDS/0.5%DOC as above, and were subsequently dissociated by incubation in SDS-gelsample buffer for 10 minutes at 37° C. Control, untreatedimmunoprecipitates were dissociated by boiling in SDS gel sample bufferfor 30 minutes.

Analysis of Immune Complexes and Western Blotting

Immune complexes containing radio-iodinated cell surface antigens wereanalyzed by SDS-PAGE, and autoradiography, while biotinylated complexeswere analyzed by SDS-PAGE and western blotting. Blots were blocked forat least 1 hour with 0.01 M Tris/0.15 M NaCl, pH 8.0 containing 5% w/vdried skim milk and developed using streptavidin-conjugated horseradishperoxidase (HRP)-coupled chemiluminescent detection (Pierce). In someexperiments involving immune complexes from non-labeled cell lysates,CD109 was detected with the CD109 antibody 1B3, that detects adenaturation-resistant epitope of CD109 followed by either iodinatedProtein A³² or HRP-conjugated goat anti-mouse Ig-coupledchemiluminescence (Pierce).

cDNA Library Screening

BLAST analysis^(36,37) identified a rat incisor EST—R47123³⁸—that encoded a CD109-specific peptide. EST R47123 was obtained, the DNA sequence was confirmed, and a ˜4 kb EcoRI/XhoI fragment was subsequently radiolabeled to high specific activity with α-³²P-dCTP using the random primer synthesis method³⁹ and used to screen a phage Uni-ZAP human umbilical vein endothelial cell (HUVEC) cDNA library (Stratagene):

The phage HUVEC cDNA library was plated and prepared for plaquehybridization. Duplicate supported nitrocellulose filters totalling3.5×10⁶ plaques were pre-hybridized in Quik-Hyb (Stratagene) for 4 hoursat 65° C., hybridized with the EST probe in Quik-Hyb at 65° C.overnight, washed progressively to a final stringency of 0.1× standardsaline citrate (SSC; 0.15 M NaCl, 0.015 M sodium citrate), 0.1% sodiumdodecyl sulfate (SDS), 65° C., and exposed to X-ray film at −80° C.Positive plaques were picked into 500 μl SM buffer with 20 μl chloroformand eluted overnight at 4° C. After three further rounds of plaquepurification, positive clones were rescued into pBS SK⁺ by in vivoexcision using the ExAssist/SOLR (Stratagene) system according to themanufacturer's instructions. Independent colonies were picked, grownovernight in LB-ampicillin, plasmid DNA was prepared by the alkalinelysis method, and inserts were evaluated by restriction endonucleasedigestion and DNA sequence analysis.

Two pAX142-based⁴⁰ human erythroleukemia (HEL) cDNA libraries—L12 and L13—were screened by PCR using the CD109-specific nested antisense oligonucleotide primers 109-6-1-3 and 109-6-1-6 (which lie at the 5′ end of clone H6, the largest of the HUVEC-derived cDNA clones above) and library-specific primers hEF-1 (5′-CCTCAGACAGTGGTTCAA-3′) and supF (5′-CTTCGAACCTTCGMTCC-3′). Fifty μl “hot-start” PCR reactions (1× GIBCO PCR buffer, 1.5 mM MgCl₂, 200 μM each dNTP (Boehringer), 1 μM each primer, 1.25 units Taq polymerase (GIBCO), 3 μl library cDNA) underwent 35 cycles of 94° C. for 45 s, 54° C. for 45 s, and 72° C. for 45-60 s. Resultant PCR products were cloned into pMAB1 (a pBS SK(−) derivative containing a PmeI recognition site within the polylinker) and analysed as above.
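As a rough plausibility check on the annealing step, the melting temperature of a short primer can be estimated with the Wallace rule of thumb, Tm ≈ 2(A+T) + 4(G+C) in degrees C. This is only an approximation for short oligonucleotides and is not stated to be the method used to design these primers; the function name is hypothetical. The supF sequence is not evaluated because, as printed, it contains a non-ACGT character.

```python
def wallace_tm(primer: str) -> int:
    """Estimate primer Tm (degrees C) with the Wallace rule: 2(A+T) + 4(G+C)."""
    primer = primer.upper()
    if set(primer) - set("ACGT"):
        raise ValueError("primer contains characters other than A, C, G, T")
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


# hEF-1 library-specific primer sequence, as given in the text.
print(wallace_tm("CCTCAGACAGTGGTTCAA"))  # 54
```

For the 18-mer hEF-1 primer (9 A/T and 9 G/C) the estimate is 54, which is in the range of the annealing temperature used in the cycling conditions above.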

A KG1a cDNA library⁴¹ constructed in ZAP Express was plated and 2×10⁷plaques were screened as above using a clone H6 probe that had beenradiolabeled to high specific activity with ³²P-dCTP by the randomprimer synthesis method. Positive plaques that survived tertiaryscreening were rescued into pBK-CMV by in vivo excision using theExAssist/SOLR (Stratagene) system, and resultant plasmid clones wereamplified and analysed by restriction endonuclease digestion and DNAsequence determination, as appropriate.

Restriction Endonuclease Analysis and DNA Sequencing

HUVEC- and KG1a-derived clones were digested singly (or in combination,as appropriate) with EcoRI, XhoI, KpnI, PstI, SmaI, BamHI, XbaI, NotI,and SacI, using the corresponding buffers supplied by the manufacturer(NEB). Digested samples were size-separated electrophoretically on 1%-2%agarose/TAE (40 mM Tris-acetate; 1 mM EDTA, pH 8.3) gels containing 0.5μg/ml ethidium bromide, and were then analysed visually.

The nucleotide sequences of HUVEC-, KG1a-, and HEL-derived clones weredetermined either by i.) the dideoxynucleotide chain termination methodusing ³⁵S dCTP (NEN) together with a T7 sequencing kit (AmershamPharmacia Biotech), with reactions being size-separatedelectrophoretically on a 6% polyacrylamide/5 M urea gel that was thentransferred to filter paper, dried by vacuum, and exposed to X-ray filmovernight at room temperature, or ii.) using an ABI Prism dye terminatorsequencing kit, together with a Perkin Elmer-Cetus 2400 thermocycler andan ABI Prism 310 Genetic Analyser (Perkin Elmer Applied Biosystems) assuggested by the manufacturer. Oligonucleotide primers used forsequencing reactions were synthesized by Gibco BRL and comprisedstandard T7 (5′-GTAATACGACTCACTATAGGGC-3′) and T3(5′-AATTAACCCTCACTAAAGGG-3′) primers and a series of gene-specific CD109primers. The HUVEC-derived H6 and H7, HEL-derived E2, and KG1a-derivedK1 clones (FIG. 2A) were sequenced in their entirety on both strands.

Northern Blot Hybridization Analysis

Total RNA from KG1a cells was extracted using TRIzol (Gibco BRL Life Technologies) as directed by the manufacturer. Poly(A) mRNA was purified from total KG1a RNA using an Oligotex mRNA kit (QIAGEN) according to the manufacturer's protocol. In either case, the final RNA pellet was dissolved in diethylpyrocarbonate (DEPC; Sigma)-treated water. Approximately 20 μg of total RNA and less than 1 μg of mRNA were size-separated electrophoretically in a 1% agarose/6.7% formaldehyde gel in MOPS buffer (20 mM 3-(N-morpholino)propanesulphonic acid, 8 mM sodium acetate, 1 mM EDTA, pH 7), transferred to a Hybond N nylon membrane (Amersham Pharmacia Biotech), crosslinked by baking in vacuo at 80° C. for 1 hour, and hybridized overnight in Quik-Hyb (Stratagene) with a CD109 K1 probe radiolabeled as above. The membrane was then washed to a final stringency of 1× SSC/0.1% SDS, and exposed to X-ray film with an intensifying screen at −80° C.

To determine the expression of CD109 in various human tissues, a humanmultiple tissue RNA nylon membrane (Human RNA Master Blot, Clontech)containing poly(A) RNAs from 50 human tissues was hybridized overnightat 65° C. with the same high specific activity radiolabeled CD109 K1probe as above, but using ExpressHyb Hybridization Solution (Clontech)according to the manufacturer's instructions. Subsequently, the blot waswashed progressively to a final stringency of 0.1× SSC/0.5% SDS at 55°C. and was exposed to X-ray film with an intensifying screen at −80° C.

RT-PCR

RNA prepared as above was treated for 15 minutes at 37° C. with 3 U/μgRQ1 RNAse-free DNAse (Promega) in 50 μl of 10 mM Tris-HCl, pH 8.3; 50 mMKCl; 1.5 mM MgCl₂, additionally containing 40 units of RNAse inhibitor(RNAguard, Amersham Pharmacia Biotech). RNA was then extractedsequentially with equal volumes of phenol-chloroform and chloroformalone, and after the addition of 0.1 volume of 3 M sodium acetate, pH5.2, 20 μg of glycogen (Boehringer), and 40 U RNAse inhibitor, wasprecipitated overnight in absolute ethanol at −80° C. Aftercentrifugation, the RNA pellet was washed with RNAse-free 75% ethanol,and resuspended in RNAse-free water.

cDNA was prepared from 5 μg KG1a RNA using SuperScript I reversetranscriptase (GIBCO BRL Life Technologies) in a 20 μl reaction volumecontaining 40-60 ng random hexameric oligodeoxyribonucleotides (AmershamPharmacia Biotech) as recommended by the manufacturer, but alsocontaining 40 U RNAse inhibitor, and 5 mM dithiothreitol (DTT; Sigma).Reactions were terminated by heat inactivation at 95° C. for 5 minutes.

Reverse transcription efficiency was initially tested by PCR using oligonucleotide primers HPRT-5′ (5′-GAGGATTTGGAAAGGGTGTT-3′) and HPRT-3′ (5′-ACAATAGCTCTTCAGTCTGA-3′), which yield a 231 bp product specific to human hypoxanthine-guanine phosphoribosyl transferase (HPRT). Fifty μl PCR reactions (1× Gibco PCR buffer, 2 mM MgCl₂, 200 μM each dNTP (Boehringer), 1 μM each primer, 1.0 units Taq polymerase (GIBCO), 2 μl reverse transcription reaction) underwent 35 cycles of “touchdown” PCR. Specifically, after a hot start at 94° C., the denaturation and extension steps remained at 94° C. and 72° C., respectively, for 60 s, but the annealing step (also 60 s) began at 65° C. for 2 cycles, and thereafter decreased in steps of 2° C. every 2 cycles until 47° C. The last 17 cycles continued at the annealing temperature of 47° C. The final extension was extended to 10 minutes. Finally, PCR-amplified products were size-separated electrophoretically in 1% agarose/TAE gels containing 0.5 μg/ml ethidium bromide, and were inspected visually.
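The touchdown annealing profile described above can be written out cycle by cycle to confirm that it accounts for all 35 cycles. The following is a minimal sketch; the function name and parameter names are hypothetical, and only the temperatures and cycle counts are taken from the text.

```python
def touchdown_schedule(start: int = 65, end: int = 47, step: int = 2,
                       cycles_per_step: int = 2,
                       total_cycles: int = 35) -> list[int]:
    """Annealing temperature (degrees C) for each cycle of a touchdown PCR."""
    temps: list[int] = []
    temp = start
    # Step down from the starting temperature until the final one is reached.
    while temp > end and len(temps) < total_cycles:
        temps.extend([temp] * cycles_per_step)
        temp -= step
    temps = temps[:total_cycles]
    # All remaining cycles run at the final annealing temperature.
    temps.extend([end] * (total_cycles - len(temps)))
    return temps


schedule = touchdown_schedule()
print(schedule[:6])        # [65, 65, 63, 63, 61, 61]
print(schedule.count(47))  # 17
print(len(schedule))       # 35
```

Nine temperature steps from 65 down to 49 at 2 cycles each give 18 cycles, leaving 17 cycles at 47, consistent with the stated total of 35.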

In Vitro Transcription/Translation

The KG1a-derived CD109 K1 clone in pBK-CMV was digested with NotI andSalI to liberate a cDNA fragment containing the entire open readingframe (ORF)—including the translation initiation codon—but missing the5′ and 3′ UTRs, which was then inserted into NotI/SalI digested pBS IIKS(−) such that the CD109 cDNA was placed downstream of the T7 promoter(pBS KS II T7/K1). Restriction enzyme digestion and DNA sequencing wereused to confirm insert orientation.

The pBS KS II T7/K1 construct (1.2 μg) was then transcribed andtranslated in vitro using the T7/T3 TNT Coupled Reticulocyte LysateSystem (Promega) and ³⁵S-methionine (Amersham), in a 50 μl reactionfollowing the manufacturer's protocol.

Following completion of the TNT reaction, 5 μl of the reaction mix were added to 195 μl of lysis buffer (50 mM Hepes, pH 7.5; 5 mM MnCl₂, 10 mM MgCl₂, 5 mM EGTA, 2 mM EDTA, 100 mM NaCl, 5 mM KCl, and 0.1% NP-40) supplemented with 50 μg/ml aminoethylbenzenesulfonylfluoride, 1 μg/ml antipain, 1 μg/ml leupeptin, 1 μg/ml pepstatin, and 1 μg/ml aprotinin (ICN Pharmaceuticals). Then 1.5 μg of the monoclonal CD109 antibody 1B3 or of the irrelevant CD71 mAb D51 were added, and the resultant mix was incubated on ice for 1 hour.

Protein A-Sepharose beads (Pharmacia) in lysis buffer (0.1 mg/ml) were initially precleared with KG1a cell lysate: 2×10⁷ KG1a cells were washed with PBS and lysed in 500 μl of lysis buffer on ice for 20 minutes, and the supernatant was recovered by centrifugation (1000 g, 4° C., 1 minute). 100 μl of resuspended protein A-Sepharose beads were subsequently added to the supernatant, incubated on ice for 30 minutes, recovered by centrifugation (1000 g, 4° C., 5 minutes), washed twice (500 μl), and resuspended (100 μl) in cold lysis buffer.

To recover immune complexes, 20 μl of pre-cleared protein A-Sepharose beads were added to the TNT/antibody mixtures, rotated constantly at 4° C. for 45 minutes, and then washed four times with 1 ml cold lysis buffer. Finally, the beads were resuspended in 2 μl 2× SDS sample buffer (100 mM Tris-HCl, pH 6.8; 4% SDS; 0.2% bromophenol blue; 20% glycerol) containing 200 mM DTT. After boiling for 3 minutes and centrifugation, supernatants were size-separated electrophoretically on a 6% SDS-PAGE gel. The gel was transferred to blotting paper, dried, and finally exposed to X-ray film at room temperature.

Expression of CD109 cDNA in CHO Cells

The CD109 K1 cDNA ORF was excised from pBK-CMV as above, the SalI overhangs were made blunt with Klenow DNA polymerase (NEB), and the polished fragment was inserted downstream of the CMV promoter (but upstream of the IRES sequence) into EcoRV/NotI cut pIRES-EYFP (Clontech) to yield pK1/YFP. The orientation of the insert was subsequently verified by restriction enzyme analysis and DNA sequencing.

CHO cells were grown to ~75% confluency and seeded at a density of 2-3×10⁵ cells per well on a six-well plate before transfection. For transient expression of CD109, monolayers of CHO cells in a six-well plate were transfected with 20 μl of cesium chloride purified pK1/YFP fusion constructs (or with empty pIRES-EYFP control vector) using 20 μl SuperFect (QIAGEN) according to the manufacturer's protocol. Briefly, the DNA/SuperFect mixtures were incubated with the CHO cells at 37° C./5% CO₂. After 3 hours, this mixture was removed, fresh medium and serum were added, and the cells were grown without drug selection for another 24-72 hours at 37° C./5% CO₂. Cells were then lifted from the plates with citric saline (135 mM KCl, 15 mM sodium citrate), incubated with 3 μg PE-conjugated CD109 mAb 8A3 for 30 minutes on ice, rinsed twice in Tris-buffered saline (TBS; 25 mM Tris-HCl (pH 8.1), 140 mM NaCl, 2.7 mM KCl), and analysed flow cytometrically on a FACScan cytofluorimeter (Becton-Dickinson). Transfected CD109 expression was determined by assessing 8A3 binding to EYFP positive cells.

FISH Mapping of CD109 Locus

The ~4.5 kb CD109 cDNA (clone K1) was used to screen a P1 derived artificial chromosome (PAC) library in the Canadian Genome Analysis and Technologies (CGAT) Physical Mapping Resource Facility (Hospital for Sick Children, Toronto). The two resultant CD109-specific pCYPAC-1 clones (94J24 and 4L10) were then used for fluorescence in situ hybridization (FISH) analysis of normal human lymphocyte chromosomes counterstained with propidium iodide and 4′,6-diamidino-2-phenylindole dihydrochloride (DAPI). Following probe biotinylation by nick translation, and cot-1 suppression by preannealing, hybridization was detected with avidin-fluorescein isothiocyanate (FITC). Images of metaphase preparations were visualized by digital imaging microscopy using a thermoelectrically cooled charge coupled camera (Photometrics, Tucson, Ariz.). Hybridization signals and DAPI banded chromosome images were acquired, and pseudo coloured yellow (FITC) and blue (DAPI) signals were overlaid electronically and merged using Adobe Photoshop 3.0 software. Chromosomal band assignment was determined by measuring the fractional chromosome length and by analysing the banding pattern generated by the DAPI counterstained image.

Radiation Hybrid Mapping

Two PCR primers (K1UTRs, 5′-GTCACATGTGATTGTATGTTTTCG-3′; K1UTRas, 5′-GGGGAAAATATAGACACACAACTGC-3′) were designed to amplify a 189 bp fragment of the CD109 clone K1 3′ UTR. PCR reactions were carried out in 25 μl reaction volumes with 25 ng of human genomic DNA; 12.5 pmol of each primer; 1.25 units of Taq polymerase; 200 pmol of each dNTP; 1.0 mM MgCl₂; 20 mM Tris-HCl, pH 8.4 and 50 mM KCl. A “hot start” was carried out with the addition of Taq polymerase and dNTP after an initial denaturation at 94° C. for 5 minutes, followed by 30 cycles of 94° C. × 30 s, 52° C. × 30 s, 72° C. × 30 s, and then a single final extension at 72° C. for 15 minutes. All reactions were carried out in a DNA Thermocycler 480 (Perkin Elmer) with an overlay of two drops of mineral oil. At the completion of the PCR run, 5 μl of loading buffer were added to each reaction and a 10 μl aliquot was size-separated electrophoretically on a 2% agarose/TAE gel containing 0.5 μg/ml ethidium bromide, and inspected visually. Negative controls to check for cross contamination were negative, as was the homology control with hamster DNA (A3).
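For convenience, the per-reaction amounts recited above can be converted to final concentrations, and basic properties of the two primers can be computed. The following Python sketch is illustrative only: the helper names are hypothetical and do not form part of the original method, and the only inputs are the amounts, reaction volume and primer sequences stated in the preceding paragraph. The conversion relies on the identity 1 pmol/μl = 1 μM.

```python
# Values copied from the radiation hybrid mapping PCR described in the text.
REACTION_VOLUME_UL = 25.0
PRIMER_PMOL = 12.5        # of each primer per reaction
DNTP_PMOL = 200.0         # of each dNTP per reaction
TAQ_UNITS = 1.25

K1UTRS = "GTCACATGTGATTGTATGTTTTCG"     # sense primer, 5'->3'
K1UTRAS = "GGGGAAAATATAGACACACAACTGC"   # antisense primer, 5'->3'


def amount_to_micromolar(pmol, volume_ul):
    """Convert an amount in pmol to a concentration in uM (pmol/ul == uM)."""
    return pmol / volume_ul


def gc_fraction(seq):
    """Fraction of G and C bases in an oligonucleotide sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    primer_um = amount_to_micromolar(PRIMER_PMOL, REACTION_VOLUME_UL)
    dntp_um = amount_to_micromolar(DNTP_PMOL, REACTION_VOLUME_UL)
    print(f"each primer: {primer_um} uM")
    print(f"each dNTP:   {dntp_um} uM")
    print(f"Taq:         {TAQ_UNITS / REACTION_VOLUME_UL} U/ul")
    for name, seq in (("K1UTRs", K1UTRS), ("K1UTRas", K1UTRAS)):
        print(f"{name}: {len(seq)} nt, GC fraction {gc_fraction(seq):.2f}")
```

The sketch reports only quantities that follow arithmetically from the stated values; it makes no attempt to predict annealing behaviour, which was determined empirically in the protocol above.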

Both the GeneBridge 4 RH panel and Stanford G3 RH panel were screened using the K1UTRs/as primer pair. The GeneBridge 4 RH panel controls were HFL (human genomic DNA, positive) and A23 (hamster genomic DNA, negative), while the Stanford G3 RH panel controls were A3 (non-irradiated hamster genomic DNA, negative), and RM (non-irradiated human genomic DNA, positive). Panel results were scored independently.

The present invention has been described in detail and with particular reference to the preferred embodiments; however, it will be understood by one having ordinary skill in the art that changes can be made without departing from the spirit and scope thereof. For example, where the application refers to proteins, it is clear that peptides and polypeptides may often be used. Likewise, where a gene is described in the application, it is clear that nucleic acid molecules or gene fragments may often be used.

All publications (including Genbank entries), patents and patent applications are incorporated by reference in their entirety to the same extent as if each individual publication, patent or patent application was specifically and individually indicated to be incorporated by reference in its entirety.

TABLE 3
CD109 antibodies: 8A3, 23/5F6, 39/6C3, TEA 2/16, D2, LDA1, 7D1, 40B8, 1B3, 59D6, 8A1, 7C5, B-E47

Ab name | Ref | Submitter | Department | Organisation | Address | Species | Subclass | Ascites/Supernatant/Purified
8A3 | 1, 2 | Robert Sutherland | Oncology Research | Toronto Hospital | Room 407, 67 College St., Toronto, Ontario, M5G 2M1, Canada | M | IgG2aK | S/P
23/5F6 | | Walter Knapp | Institute of Immunology | University of Vienna | Borschkegasse 8a, 1090 Vienna, Austria | M | IgG2a | ?
39/6C3 | | Walter Knapp | Institute of Immunology | University of Vienna | Borschkegasse 8a, 1090 Vienna, Austria | M | IgG2a | ?
TEA 2/16 | | Francisco Sanchez-Madrid | Service Immunologia | Hospital de la Princesa | c/Diego de Leon 62, 28006 Madrid, Spain | M | IgG1 | ?
D2 | 2, 3 | Robert Finberg | Division of Infectious Diseases | Dana-Farber Cancer Institute | 44 Binney Street, Boston, MA 02115, USA | M | IgG1 | A/P
LDA1 | 2, 4 | Nicole Suciu-Foca | Dept. of Pathology | Columbia University | 630 West 168th Street, New York, NY 10032, USA | M | IgG2a | A/P
7D1 | 1, 2 | Robert Sutherland | Oncology Research | Toronto Hospital | Room 407, 67 College St., Toronto, Ontario, M5G 2M1, Canada | M | IgG1K | S/P
40B8 | 2 | Hans-Jorg Buhring | Medizinische Klinik und Poliklinik | Eberhard-Karls University of Tubingen | Dept II, Otfried-Muller-Str. 10, 72076 Tubingen, Germany | M | IgG1 | P
1B3 | 2, 5 | Diane Nugent | | Childrens Hospital of Orange County | 455 Main St., Orange, CA 92668, USA | M | IgG2a | A/P
59D6 | 2 | Hans-Jorg Buhring | Medizinische Klinik und Poliklinik | Eberhard-Karls University of Tubingen | Dept II, Otfried-Muller-Str. 10, 72076 Tubingen, Germany | M | IgM | S
8A1 | 1, 2 | Robert Sutherland | Oncology Research | Toronto Hospital | Room 407, 67 College St., Toronto, Ontario, M5G 2M1, Canada | M | IgG1K | S/P
7C5 | 1, 2 | Robert Sutherland | Oncology Research | Toronto Hospital | Room 407, 67 College St., Toronto, Ontario, M5G 2M1, Canada | M | IgG1K | S/P
B-E47 | | sold by EuroClone, Leinco, Diaclone | | | | M | IgG1 | S/P

1. Sutherland D. R., Yeo E., Ryan A., Mills G. B., Bailey D. and Baker M. A. Blood 77, 84-93 [1991].
2. Sutherland D. R. and Yeo E. L. Cluster report: CDw109. In: Leukocyte Typing V (ed. S. Schlossman et al.) pp. 1767-9. Oxford University Press, Oxford [1995].
3. Haregewoin A., Solomon K., Hom R. C., Soman G., Bergelson J. M., Bhan A. K. and Finberg R. W. Cellular Immunology 156, 357-70 [1994].
4. Suciu-Foca N., Reed E., Rubenstein P., MacKenzie W., Ng A. and King D. W. Nature 318, 465-7 [1985].
5. Brashem-Stein C., Nugent D. and Bernstein I. D. Journal of Immunology 140, 2330-3 [1988].

1. An isolated nucleic acid molecule comprising the nucleic acid molecule of the coding strand shown in SEQ ID NO:1.
2. The molecule of claim 1, wherein the molecule encodes a polypeptide including a thioester region which becomes reactive towards a nucleophile when the polypeptide is cleaved.
3. An isolated nucleic acid molecule, comprising a nucleic acid molecule encoding the same amino acid sequence as a nucleotide sequence of claim 1.
4. The nucleic acid molecule of claim 1, wherein the polypeptide activity comprises a K1 polypeptide SEQ ID NO:2.
5. The nucleic acid molecule of claim 1, consisting of the nucleotide sequence shown in SEQ ID NO:1.
6. An isolated nucleic acid molecule comprising SEQ ID NO:1 isolated from a human.
7. The nucleic acid molecule of claim 1, wherein the molecule comprises genomic DNA, cDNA or RNA.
8. The nucleic acid molecule of claim 7, wherein the nucleic acid molecule is chemically synthesized.
9. A recombinant nucleic acid molecule comprising the nucleic acid molecule of claim 1 and a constitutive promoter sequence or an inducible promoter sequence operatively linked so that the promoter enhances transcription of the nucleic acid molecule in a host cell.
10. A vector comprising the nucleic acid molecule of claim 1.
11. The vector of claim 10, comprising a promoter selected from the group consisting of a vav promoter, a H2K promoter, a PF4 promoter, a GP1b promoter, a lck promoter, a CD2 promoter, a granzymeB promoter, a Beta actin promoter, a PGK promoter, a CMV promoter, a retroviral LTR, a metallothionein IIA promoter, an ecdysone promoter and a tetracycline inducible promoter.
12. A host cell comprising the recombinant nucleic acid molecule of claim 9, or progeny of the host cell.
13. The host cell of claim 12, selected from the group consisting of a mammalian cell, a fungal cell, a yeast cell, a bacterial cell, a microorganism cell and a plant cell.
14. An isolated nucleic acid molecule encoding a polypeptide comprising the same amino acid sequence as the polypeptide encoded by SEQ ID NO:1.
15. A pharmaceutical composition comprising the nucleotide sequence of claim 1.
16. A kit for the treatment or detection of a disease, disorder or abnormal physical state, comprising the nucleotide sequence of claim 1.